Digital Analysis of Circulating Tumor Cells in Blood Samples

ABSTRACT

This disclosure relates to new assay methods for analysis of circulating tumor cells (CTCs) in blood samples for detection, e.g., early detection, and/or monitoring of disease, e.g., cancer. The methods provide ultra-high sensitivity and specificity, and include the use of microfluidic isolation of CTCs and digital detection of RNA derived from the CTCs.

TECHNICAL FIELD

This invention relates to blood sampling techniques, and more particularly to methods and systems for detecting and analyzing cells in blood samples.

BACKGROUND

The ability to detect the presence of rare circulating tumor cells (CTCs) using a simple blood test, or “liquid biopsy,” has the potential to greatly enhance the monitoring of epithelial cancers, providing instant sampling of tumor cell numbers, genetic composition, and drug response parameters, without requiring invasive tumor biopsies. Thus, the detection of CTCs for early cancer detection has the potential to revolutionize the treatment of cancer, enabling the diagnosis of invasive cancer at a stage before it has metastasized, when curative treatment is expected.

However, CTCs are very rare, and identifying, visualizing, and scoring these tumor cells admixed with normal blood components remain a significant challenge, even after partial purification with known microfluidic devices or similar technologies. For example, per milliliter of whole blood, there are only 1-10 CTCs amongst more than 5 billion red blood cells (RBCs) and more than 5 million white blood cells (WBCs) (Plaks et al., “Cancer Circulating Tumor Cells,” Science, 341:1186; 2013). In addition, antibody staining of tumor cells is highly variable, due to high heterogeneity among cancer cells, even within an individual patient, as well as the poor physical condition of many tumor cells that circulate in the bloodstream, many of which have begun to undergo programmed cell death or anoikis. Moreover, accurate scoring of antibody-stained tumor cells requires differentiation from large numbers of contaminating white blood cells, some of which bind to antibody reagents non-specifically. As such, only a subset of candidate tumor cells can be robustly identified by antibody staining, and as many as half of patients tested have no detectable cells, despite having widely metastatic cancer.

Thus, current protocols for imaging CTCs are seeking higher and higher levels of purity in the isolation of CTCs, especially from other nucleated blood cells, such as white blood cells (WBCs).

SUMMARY

The present disclosure relates to methods, uses, and systems to obtain the highest possible sensitivity of data relating to rare CTCs in standard blood samples, while avoiding the need for extremely high levels of purity of the CTCs. In particular, the new methods do not need the CTCs to be completely isolated from contaminating WBCs, and instead can reliably detect as few as one CTC in products containing, e.g., up to 10,000 WBCs or more. The new assay methods and systems combine (1) an isolation system that can consistently obtain CTCs as intact, whole cells (with high quality ribonucleic acid (RNA)) from blood with (2) a droplet-based digital polymerase chain reaction (PCR) assay focused on RNA markers of specific cancer lineages for each tumor type that are absent in blood of healthy patients.

When combined as described herein, these two concepts provide a CTC digital droplet PCR assay method (“CTC ddPCR”) or simply stated a “digital-CTC” assay (“d-CTC”). In some embodiments, the isolation system is a microfluidic system, such as a negative depletion microfluidic system (e.g., a so-called “CTC-Chip,” that uses negative depletion of hematopoietic cells, e.g., red blood cells (RBCs), WBCs, and platelets, to reveal untagged non-hematopoietic cells such as CTCs in a blood sample).

In general, the disclosure relates to methods for early detection of cancer with ultra-high sensitivity and specificity, wherein the methods include the use of microfluidic isolation of circulating tumor cells (CTCs) and digital detection of RNA derived from the CTCs. In some embodiments, the CTC-derived RNA can be converted into cDNA and encapsulated into individual droplets for amplification in the presence of reporter groups that are configured to bind specifically to cDNA from CTCs and not to cDNA from other cells. The droplets positive for reporter groups can be counted to assess the presence of disease, e.g., various types of cancer.

In another aspect, the disclosure relates to methods of analyzing circulating tumor cells (CTCs) in a blood sample. The methods include or consist of isolating from the blood sample a product comprising CTCs and other cells present in blood; isolating ribonucleic acid (RNA) molecules from the product; generating cDNA molecules in solution from the isolated RNA; encapsulating cDNA molecules in individual droplets; amplifying cDNA molecules within each of the droplets in the presence of reporter groups configured to bind specifically to cDNA from CTCs and not to cDNA from other cells; detecting droplets that contain the reporter groups as an indicator of the presence of cDNA molecules from CTCs in the droplets; and analyzing CTCs in the detected droplets.

The methods described herein can further include reducing a volume of the product before isolating RNA and/or removing contaminants from the cDNA-containing solution before encapsulating the cDNA molecules.

In various implementations of the new methods, generating cDNA molecules from the isolated RNA can include conducting reverse transcription (RT) polymerase chain reaction (PCR) of the isolated RNA molecules and/or amplifying cDNA molecules within each of the droplets can include conducting PCR in each droplet. In the new methods, encapsulating individual cDNA molecules and PCR reagents in individual droplets can include forming at least 1000 droplets of a non-aqueous liquid, such as one or more fluorocarbons, hydrofluorocarbons, mineral oils, silicone oils, and hydrocarbon oils and/or one or more surfactants. Each droplet can contain, on average, one target cDNA molecule obtained from a CTC. In some embodiments, the reporter groups can be or include a fluorescent label.
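The statement that each droplet can contain, on average, one target cDNA molecule can be illustrated with Poisson statistics. Poisson partitioning is the standard model for random loading in digital PCR, but it is an assumption here and is not specified in this disclosure. The following minimal sketch uses hypothetical function names and an illustrative droplet count; it computes the expected fractions of droplets holding zero, exactly one, or more than one target molecule, and inverts the fraction of positive droplets to estimate the mean number of copies per droplet.

```python
import math

def occupancy_fractions(mean_copies_per_droplet):
    """Return (empty, single, multiple) droplet fractions under a Poisson model."""
    lam = mean_copies_per_droplet
    empty = math.exp(-lam)            # P(k = 0)
    single = lam * math.exp(-lam)     # P(k = 1)
    multiple = 1.0 - empty - single   # P(k >= 2)
    return empty, single, multiple

def copies_per_droplet(positive_droplets, total_droplets):
    """Estimate mean copies per droplet from the fraction of positive droplets."""
    if not 0 <= positive_droplets < total_droplets:
        raise ValueError("need 0 <= positive < total")
    negative_fraction = 1.0 - positive_droplets / total_droplets
    return -math.log(negative_fraction)

if __name__ == "__main__":
    empty, single, multiple = occupancy_fractions(1.0)
    print(f"mean 1.0: empty={empty:.3f}, single={single:.3f}, multiple={multiple:.3f}")
    # 20000 total droplets is an illustrative number, not a value from this disclosure.
    print(f"copies/droplet: {copies_per_droplet(100, 20000):.5f}")
```

Under this model, a mean occupancy of one copy per droplet leaves a substantial share of droplets empty or multiply loaded, which is why counts of positive droplets are usually corrected with the logarithmic estimate rather than read directly as molecule counts.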

The new methods can include removing contaminants from the cDNA-containing solution by use of Solid Phase Reversible Immobilization (SPRI), e.g., immobilizing cDNA in the solution, e.g., with magnetic beads that are configured to specifically bind to the cDNA; removing contaminants from the solution; and eluting purified cDNA.

In various implementations, the methods described herein include using probes and primers in amplifying the cDNA molecules within each of the droplets that correspond to one or more genes selected from the list of cancer-selective genes in Table 1 herein. For example, the selected genes can include prostate cancer-selective genes, e.g., any one or more of AGR2, FOLH1, HOXB13, KLK2, KLK3, SCHLAP1/SET4, SCHLAP1/SET5, AMACR, AR variants, UGT2B15/SET1, UGT2B15/SET5, and STEAP2 (as can be easily determined from Table 1). In another example, any one or more of ALDH1A3, CDH11, EGFR, FAT1, MET, PKP3, RND3, S100A2, and STEAP2 are selective for pancreatic cancer. Similar lists can be generated for the other types of cancers listed in Table 1.

In other examples, the selected genes include any one or more of the breast cancer-selective genes listed in Table 1. In other examples, the selected genes include genes selective for one or more of lung, liver, prostate, pancreatic, and melanoma cancer. For example, a multiplexed assay can include 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or even all of the selected genes that are listed in Table 1 as being selective for a particular type of cancer, e.g., breast cancer, lung cancer, prostate cancer, pancreatic cancer, liver cancer, and melanoma. Typically a group of primers and probes for 5 to 12 cancer-selective genes from Table 1 are used for a particular type of cancer. Other specific combinations of selected genes (markers for those genes) are described in the Examples below.

The methods can also include using one or more genes selective for two or more, three or more, four or more, or five or more different types of cancer. For example, the genes can be selective for breast cancer and lung cancer; breast cancer, lung cancer, and liver cancer; breast cancer, lung cancer, and pancreatic cancer; breast cancer, lung cancer, and prostate cancer; breast cancer, liver cancer, and melanoma; breast cancer, lung cancer, and melanoma; breast cancer, lung cancer, liver cancer, and prostate cancer; breast cancer, lung cancer, liver cancer, and melanoma; breast cancer, lung cancer, liver cancer, and pancreatic cancer; breast cancer, lung cancer, prostate cancer, and pancreatic cancer; breast cancer, lung cancer, liver cancer, melanoma, and pancreatic cancer; or breast cancer, lung cancer, liver cancer, melanoma, pancreatic cancer, and prostate cancer.

In the methods described herein, the CTCs can arise from metastatic or primary/localized cancers. In the new methods, the step of analyzing the CTCs in the detected droplets can include monitoring CTCs from a blood sample from a patient, e.g., with a known cancer, e.g., over time, and testing and/or imaging the CTCs (e.g., using standard techniques) to provide a prognosis for the patient. In other embodiments, the step of analyzing the CTCs in the detected droplets can include testing and/or imaging the CTCs (e.g., using standard techniques) from a blood sample from a patient to provide an indication of a response by the CTCs to a therapeutic intervention.

In other embodiments, the step of analyzing the CTCs in the detected droplets includes determining a number or level of CTCs per unit volume of a blood sample from a patient to provide a measure of tumor burden in the patient. The methods can then further include using the measure of tumor burden in the patient to select a therapy or can further include determining the measure of tumor burden in the patient at a second time point to monitor the tumor burden over time, e.g., in response to a therapeutic intervention, e.g., for dynamic monitoring of tumor burden.
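The measure of tumor burden and its comparison at a second time point can be sketched as follows. This sketch assumes that positive droplets per milliliter of blood, summed over all markers in a multiplexed assay, serve as the level of CTC signal per unit volume. The function names and all counts are hypothetical and are not data from this disclosure.

```python
def droplets_per_ml(positive_droplets_by_marker, blood_volume_ml):
    """Sum positive droplets over all markers and normalize to blood volume."""
    if blood_volume_ml <= 0:
        raise ValueError("blood volume must be positive")
    return sum(positive_droplets_by_marker.values()) / blood_volume_ml

def fold_change(burden_first, burden_second):
    """Ratio of the second measurement to the first; None if the first is zero."""
    if burden_first == 0:
        return None
    return burden_second / burden_first

if __name__ == "__main__":
    # Illustrative counts only; not data from this disclosure.
    baseline = {"KLK3": 120, "AMACR": 40, "FOLH1": 30, "EpCAM": 10}
    follow_up = {"KLK3": 30, "AMACR": 10, "FOLH1": 8, "EpCAM": 2}
    first = droplets_per_ml(baseline, blood_volume_ml=10.0)
    second = droplets_per_ml(follow_up, blood_volume_ml=10.0)
    print(first, second, fold_change(first, second))
```

A fold change below one at the second time point would correspond to a declining signal, e.g., following a therapeutic intervention; how such a change is interpreted clinically is outside the scope of this sketch.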

The methods and assays described herein can be used to amplify and detect CTCs in a wide variety of diagnostic, prognostic, and theranostic methods.

As used herein, the phrase “circulating tumor cells” (CTCs) refers to cancer cells derived from solid tumors (non-hematogenous cancers) that are present in very rare numbers in the blood stream of patients (e.g., about 1 CTC in about 10,000,000 WBCs in whole blood). CTCs can arise from both metastatic as well as primary/localized cancers.

As used herein, a “product” means a group of isolated rare cells and other contaminating blood cells, e.g., red blood cells, white blood cells (e.g., leukocytes), e.g., in some sort of liquid, e.g., a buffer, such as a pluronic buffer, that arise from processing in the methods described herein, e.g., using the systems described herein. A typical product may contain only about one to ten CTCs admixed with 500 to 2,500 or more WBCs, e.g., one to ten CTCs in a mixture of 1000 to 2000 WBCs. However, the limit of detection of the present methods can be about 1 CTC in 10,000 WBC. Thus, while the present methods can achieve a level of purity of about 1 CTC in 500 WBCs, the present methods do not require highly purified CTCs, as is required in some known methods of CTC analysis.
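The purity figures in this definition can be checked with simple arithmetic. The minimal sketch below uses hypothetical function names and takes the stated limit of detection of about 1 CTC in 10,000 WBCs as its default threshold; it compares a product with the roughly 1 CTC in 10,000,000 WBCs of whole blood.

```python
def ctc_fraction(ctcs, wbcs):
    """Fraction of nucleated cells (CTCs plus WBCs) that are CTCs."""
    return ctcs / (ctcs + wbcs)

def within_detection_limit(ctcs, wbcs, limit_wbcs_per_ctc=10_000):
    """True if the WBC-to-CTC ratio does not exceed the stated limit of detection."""
    if ctcs <= 0:
        return False
    return wbcs / ctcs <= limit_wbcs_per_ctc

if __name__ == "__main__":
    for label, ctcs, wbcs in [
        ("product, 1 CTC in 500 WBCs", 1, 500),
        ("product, 1 CTC in 10,000 WBCs", 1, 10_000),
        ("whole blood, 1 CTC in 10,000,000 WBCs", 1, 10_000_000),
    ]:
        print(label, f"{ctc_fraction(ctcs, wbcs):.2e}", within_detection_limit(ctcs, wbcs))
```

The whole-blood ratio falls outside the stated limit while the product ratios fall within it, which illustrates why a partial enrichment step suffices and complete purification is not required.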

As used herein, a Solid Phase Reversible Immobilization (SPRI) cleanup is a technique using coated magnetic beads to perform size selection on cDNA created from Reverse Transcription (RT)-PCR of a product. In the new assay methods described herein, this accomplishes the two-fold purpose of (a) selecting only the cDNA of the correct size, and (b) removing harsh lysis detergents incompatible with the stability of the droplets.

The polymerase chain reaction (PCR) is a process of amplification of known DNA fragments by serial annealing and re-annealing of small oligonucleotide primers, resulting in a detectable molecular signal.

Reverse Transcription (RT)-PCR refers to the use of reverse transcription to generate a complementary DNA (cDNA) molecule from an RNA template, thereby enabling the DNA polymerase chain reaction to operate on RNA. An important aspect of the new methods disclosed herein is the availability of high quality RNA from whole cell CTCs that are not lysed or treated in such a way that might destroy or degrade the RNA.

As used herein, “positive droplets” are lipid-encapsulated molecules in which a PCR reaction performed with tagged primers allows visualization of the PCR amplified product. Thus, a droplet that contained a single template cDNA molecule of a particular targeted gene can become visible using fluorescence microscopy, while an “empty” or “negative” droplet is one that contains no targeted cDNA.
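The distinction between positive and negative droplets can be illustrated with a simple fluorescence threshold. This is a minimal sketch under the assumption that each droplet yields one fluorescence amplitude and that a fixed cutoff separates the two populations. The function name, the amplitudes, and the threshold are hypothetical, and commercial droplet readers apply their own classification software.

```python
def count_positive_droplets(amplitudes, threshold):
    """Return (positives, negatives) for droplets classified by a fixed cutoff."""
    positives = sum(1 for amplitude in amplitudes if amplitude > threshold)
    negatives = len(amplitudes) - positives
    return positives, negatives

if __name__ == "__main__":
    # Illustrative amplitudes in arbitrary units; not data from this disclosure.
    amplitudes = [100, 150, 4200, 90, 3900]
    print(count_positive_droplets(amplitudes, threshold=2000))
```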

The new methods and systems provide numerous advantages and benefits. For example, the current methods and systems provide results that are far more accurate and robust than either of the prior known systems when used alone. By breaking down the signal from a single CTC into hundreds or thousands of brightly fluorescent droplets, each derived from a single cDNA molecule, the new digital-CTC assays enable dramatic signal amplification. Given the strict criteria in selecting and optimizing the biomarker genes described herein, the background signal from normal blood cells is negligible in d-CTC. Thus, d-CTC enables greatly amplified signal from patients with advanced cancer (nearly 100% of patients with prostate, lung, breast, and liver cancers). Not only is the fraction of patients with a positive score significantly increased, but the high level of signal enables dynamic measurements as tumor load declines following cancer therapy. In addition, the signal amplification permits detection of CTC-derived signatures even in patients with a very low tumor burden (something that is not readily accomplished with CTC cell imaging), thus enabling significantly earlier detection of cancer.

In sum, this novel microfluidics platform provides a streamlined, ultrahigh-throughput, rapid (e.g., 3 hours per run), and extremely high sensitivity method of enriching, detecting, and analyzing CTCs in patient blood samples. The platform provides rich, clinically actionable information, and with further optimization may enable early detection of cancer.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

DESCRIPTION OF DRAWINGS

FIG. 1A is a graph showing cDNA dilutions prepared from total RNA of LNCaP prostate cancer cells, mixed with leukocytes and analyzed by droplet PCR using two different prostate primer sets. The results represent several purities and show good response of positive droplet number across this range.

FIG. 1B is a graph of manually isolated LNCaP cells spiked into healthy donor (HD) blood samples, run through the CTC-iChip, and subjected to droplet RT-PCR (KLK3 primer set). The results show excellent sensitivity down to low numbers of target cells.

FIG. 1C is a graph that shows the analysis of blood samples from healthy controls, patients with localized (resectable) prostate cancer and metastatic prostate cancer, processed through the CTC-iChip, subjected to RT-PCR and droplet analysis using three prostate-specific and one epithelial-specific biomarkers (KLK3, AMACR, FOLH1, EpCAM). The results are shown for the total number of droplets/ml for all four markers combined.

FIG. 2 is a signal intensity plot that shows KLK3 positive droplets derived from LNCaP prostate cancer cells spiked into blood and recovered using the CTC-iChip.

FIG. 3 is a bar graph that shows the minimal variation between experimental replicates and the retention of signal after sample processing through the CTC-iChip and shows increased detection sensitivity using the new assays described herein.

FIG. 4 is a signal intensity plot that shows the absence of four different cancer-specific marker-positive droplets in healthy donors using the new CTC digital droplet PCR assay methods described here (“CTC ddPCR” assay or simply “d-CTC” assay).

FIG. 5 is a signal intensity plot that shows a d-CTC assay multiplexed for four different lineage specific transcripts to detect prostate cancer cell lines spiked into blood.

FIGS. 6A to 7B are signal intensity plots showing d-CTC assays multiplexed for four different prostate cancer-specific transcripts per reaction. Both the theoretical model (FIGS. 6A and 7A) and cancer cell line data (FIGS. 6B and 7B) shown for two such reactions, Reactions 1 and 2, demonstrate that the theoretical model accurately predicts the experimental data.

FIGS. 8A to 13B are signal intensity plots showing d-CTC assays multiplexed for four different breast and lung cancer specific transcripts per reaction. Both the theoretical models (FIGS. 8A, 9A, 10A, 11A, 12A, and 13A) and cancer cell line data (FIGS. 8B, 9B, 10B, 11B, 12B, and 13B) shown for six such reactions, Reactions 1 through 6, each with different combinations of markers, demonstrate that the theoretical models accurately predict the experimental data.

FIG. 14 is a bar graph showing droplet PCR signal for seven different biomarkers (PIP, PRAME, RND3, PKP3, FAT1, S100A2, and AGR2) from 1 ng of non-amplified cell-line cDNA and from 1 μl of pre-amplified product after 10, 14, and 18 cycles of Specific Target Amplification (STA) pre-amplification, demonstrating the significant enhancement of droplet PCR signal from STA pre-amplification.

FIGS. 15A to 15C are graphs that show the results of CTC detection in patients using the new d-CTC assay methods for three different sets of patients with lung cancer (FIG. 15A), breast cancer (FIG. 15B), and prostate cancer (FIG. 15C). In each, the healthy patients had no CTCs.

FIG. 16 is a horizontal bar graph that shows the results of patient prostate cancer data using a multiplexed d-CTC assay method described herein testing for the nine biomarkers recited in the figure (AGR2, Dual, FAT1, FOLH1, HOXB13, KLK2, KLK3, STEAP2, and TMPRSS2). 91 percent of cancer patients had detectable CTCs (10 of 11 patients), 24 of 28 samples contained detectable CTCs (86%), and 0 of 12 (0 percent) of healthy donor (HD) blood samples contained CTCs.

FIG. 17 is a series of signal intensity plots showing d-CTC assays multiplexed for two different reactions (Reaction 1 (TMPRSS2, FAT1, KLK2, and STEAP2), left column, and Reaction 2 (KLK3, HOXB13, AGR2, and FOLH1), right column) for blood samples from a metastatic prostate cancer patient (top row), a localized prostate cancer patient (middle row), and from a healthy donor control sample (bottom row). In each case there were no CTCs in the healthy donor (HD) samples, but clear evidence of CTCs in the cancer samples.

FIG. 18 is a multiple bar graph illustrating the relative proportion of androgen receptor signaling genes in CTCs measured over time to provide a readout of drug response in a prostate cancer patient treated with Abiraterone®.

FIGS. 19A and 19B are graphs showing non-amplification versus 18 cycles of SMARTer pre-amplification. FIG. 19A is a bar graph that shows the level of amplicon amplification efficiency for different target regions that is consistent among the three replicates (WTA1, WTA2, WTA3). FIG. 19B is a graph that shows that using 18 cycles of SMARTer pre-amplification provides an increase in signal of approximately four orders of magnitude (10⁸ vs 10⁴) compared to a non-pre-amplified sample.

FIGS. 20A to 20C are graphs that show the results of testing of 11 markers in a multiplexed liver cancer assay. FIGS. 20A to 20C show the total droplet numbers in 21 hepatocellular carcinoma (HCC) patients (FIG. 20A), 13 chronic liver disease (CLD) patients (FIG. 20B, no significant detectable droplets) and 15 healthy donors (HDs) (FIG. 20C, no significant detectable droplets).

FIGS. 21A and 21B are graphs that show the results of a 14 marker multiplexed lung cancer assay. FIG. 21A shows the assay results for the 8 metastatic lung cancer patients and 8 healthy donors (all negative). FIG. 21B shows that all of the droplet counts per ml of blood in the cancer patients (8 of 8) were higher than in all healthy donors giving a detection rate of 100% in this assay.

FIG. 22 is a graph that shows the results of a breast cancer assay for a multiplexed eleven marker assay used in a field of 9 metastatic breast cancer patients, 5 localized breast cancer patients, and 15 healthy donors. The results show that the assay detects cancer in 7 of 9 metastatic breast cancer patients, 2 of 5 localized breast cancer patients, and none of the healthy donor samples.

FIGS. 23A and 23B are graphs that show the results of ARv7 detection in metastatic breast cancer patients. FIG. 23A is a bar graph that shows the results for samples from 10 metastatic breast cancer patients and 7 healthy donors processed through the CTC-Chip as described herein. FIG. 23B shows that five of the ten cancer patient samples were above the healthy donor background level, giving a detection rate of 5 in 10, or 50%.

FIG. 24A is a bar graph showing the detection rate of individual markers (PMEL, MLANA, MAGEA6, PRAME, TFAP2C, and SOX10) and a combined marker cocktail (SUM) in 34 melanoma patients.

FIG. 24B is a dot plot distribution of droplet signals detected in 34 melanoma patients for 182 draw points as compared to 15 healthy donors demonstrating an overall detection sensitivity above healthy donor background signal of 81% (based on draw points) and a specificity of 100% (by draw points).

DETAILED DESCRIPTION

The present disclosure relates to methods and systems to obtain information from rare cancer cells in blood samples. These methods and systems combine the power of isolation techniques such as ultrahigh-throughput microfluidic techniques, for example, negative depletion techniques, e.g., those using negative depletion of hematopoietic cells to isolate untagged CTCs in a blood sample, with analysis techniques, such as droplet-based digital polymerase chain reaction (PCR) assays focused on ribonucleic acid (RNA) markers of specific cancer lineages. This strategy can also be applied to other CTC isolation technologies that provide partial purification of cells (e.g., filtration, positive tumor cell selection), although the quality of the RNA and hence the sensitivity of the assay will be inferior to the microfluidic technologies. Similarly, other digital PCR technologies applied to RNA are capable of detecting lineage-specific primers, although the sensitivity of the droplet-based assay is likely to be the highest.

The new methods described herein can be used not only for early detection of cancers based on the presence of the CTCs in the blood, but also for tumor burden quantification as well as to monitor CTCs from a particular tumor over time, e.g., to determine any potential changes in specific tumor marker genes present in the CTCs as well as changes in the tumor as the result of specific therapies, e.g., in the context of a clinical trial or a particular therapy.

General Concepts of the Assay Methods

The isolation techniques are used to enrich CTCs from a blood sample, e.g., using ultrahigh-throughput microfluidics, such as the so-called “CTC-iChip” described in, for example, International PCT Application WO 2015/058206 and in Ozkumur et al., “Inertial Focusing for Tumor Antigen-Dependent and -Independent Sorting of Rare Circulating Tumor Cells,” Sci. Transl. Med., 5:179ra47 (2013). The CTC-iChip uses a CTC antigen-independent approach in which WBCs in the blood sample are labeled with magnetic beads, and the sample is then processed through two enrichment stages. The first stage uses deterministic lateral displacement to remove small and flexible cells/particles (RBCs, platelets, unbound magnetic beads, and plasma) while retaining larger cells (CTCs and WBCs). The second stage moves all cells into a narrow fluid stream using inertial focusing and then uses a magnetic field to pull bead-labeled WBCs out of the focused stream, leaving highly enriched CTCs. The CTC-iChip product from 10 ml of whole blood typically contains <500,000 RBCs, <5,000 WBCs, and a variable number of CTCs.
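The degree of depletion implied by these product figures can be estimated by comparing them with the whole-blood counts given in the Background (more than 5 billion RBCs and more than 5 million WBCs per milliliter). The following minimal sketch, with a hypothetical function name, treats those counts as exact inputs. Because the inputs are lower bounds and the product counts are upper bounds, the results should be read as approximate minimum depletion levels.

```python
import math

# Whole-blood cell counts per milliliter (lower bounds from the Background).
RBC_PER_ML = 5e9
WBC_PER_ML = 5e6

# Upper bounds on cells remaining in the product from 10 ml of whole blood.
PRODUCT_RBC_MAX = 500_000
PRODUCT_WBC_MAX = 5_000

def log10_depletion(cells_in, cells_out):
    """Return the log10 fold reduction in cell number across the enrichment."""
    if cells_in <= 0 or cells_out <= 0:
        raise ValueError("cell counts must be positive")
    return math.log10(cells_in / cells_out)

if __name__ == "__main__":
    blood_volume_ml = 10
    rbc_logs = log10_depletion(RBC_PER_ML * blood_volume_ml, PRODUCT_RBC_MAX)
    wbc_logs = log10_depletion(WBC_PER_ML * blood_volume_ml, PRODUCT_WBC_MAX)
    print(f"RBC depletion: at least {rbc_logs:.1f} log10")
    print(f"WBC depletion: at least {wbc_logs:.1f} log10")
```

With these inputs the calculation gives roughly a five-log reduction in RBCs and a four-log reduction in WBCs.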

Some analysis techniques further enrich and analyze the isolated CTCs, e.g., as obtained from the CTC-iChip, e.g., using droplet microfluidics. Some basic information on droplet microfluidics is described generally in Agresti et al., “Ultrahigh-Throughput Screening in Drop-Based Microfluidics for Directed Evolution,” Proc. Natl. Acad. Sci. USA, 107:4004 (2010).

As used herein, the droplet microfluidic techniques can, in certain implementations, include encapsulation of single cells, RT-PCR reagents, and lysis buffer into droplets of typically non-aqueous liquids (e.g., fluorocarbons, hydrofluorocarbons, mineral oil, silicone oil, and hydrocarbon oil; surfactants can also be included in the non-aqueous liquid, e.g., Span80, Monolein/oleic acid, Tween20/80, SDS, n-butanol, ABIL EM90, and phospholipids), in the size range of, e.g., about 0.5 pL to 15 nL in volume and, e.g., 10 to 300 μm, e.g., 20 to 100 μm, e.g., 30 to 50 μm, e.g., 35 μm in diameter. As used in the new methods described in the present disclosure, these techniques further include amplification of cancer-specific transcripts within the droplets to produce a fluorescent signal, and sorting of amplification-positive drops. This approach results in isolation of pure CTCs that can be sequenced and analyzed for the purposes of diagnosis and individualized drug therapy. Due to the high heterogeneity of CTCs, it is useful to use multiplexed amplification to detect as many CTCs as possible. Thus, instead of using one pair of primers in the PCR mixture, one can increase the probability of detecting and sorting CTCs using a combination of tumor specific primers. For additional information on the use of PCR for sorting cancer cells, see, e.g., Eastburn et al., “Identification and genetic analysis of cancer cells with PCR-activated cell sorting,” Nucleic Acids Research, 2014, Vol. 42, No. 16 e128.
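The stated volume and diameter ranges can be related to each other by treating each droplet as a sphere, which is an assumption made here for illustration. The minimal sketch below, with a hypothetical function name, converts a diameter in micrometers into a volume in picoliters.

```python
import math

def droplet_volume_pl(diameter_um):
    """Volume in picoliters of a spherical droplet with the given diameter in micrometers."""
    volume_um3 = math.pi / 6.0 * diameter_um ** 3
    return volume_um3 / 1000.0  # 1 pL equals 1000 cubic micrometers

if __name__ == "__main__":
    for diameter_um in (10, 35, 300):
        volume_pl = droplet_volume_pl(diameter_um)
        print(f"{diameter_um} um diameter: {volume_pl:.2f} pL ({volume_pl / 1000.0:.4f} nL)")
```

Under the spherical assumption, diameters of 10 μm and 300 μm correspond to about 0.5 pL and about 14 nL, respectively, consistent with the stated range of about 0.5 pL to 15 nL, and a 35 μm droplet holds about 22 pL.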

In the new assay methods, CTCs are lysed to release RNA molecules, which are representative of the genes expressed in a cancer cell. Most of the RNA markers are “lineage” specific rather than cancer specific; for example, any prostate cell (whether cancerous or not) expresses prostate lineage markers. However, normal blood cells do not, and the fact that the signal is derived from a cell circulating in the bloodstream defines it as an abnormal signal. By converting the RNA to cDNA, we can PCR amplify this lineage signal. We use droplet digital PCR, which is extraordinarily sensitive, allowing us to convert the signal from a single cancer cell (i.e., one signal in an imaging assay) into thousands of positive fluorescent droplets. The combination of multiple, highly curated gene transcripts ensures high sensitivity and specificity for cancer, and also allows for functional insights (as in the status of hormone responsive pathways in prostate and breast cancers).

As noted, the new assay methods focus on the detection and analysis of high quality RNA rather than DNA. While there has been considerable work on DNA mutation detection in plasma and in CTCs, the present methods rely on RNA markers for the following reasons:

1. DNA mutations are not tumor specific, and the discovery that a healthy individual has some unidentified cancer cells in the blood is a very difficult clinical situation. In contrast, by selecting tumor-specific RNAs (e.g., prostate vs lung), the new methods can identify the source of cancer cells in the blood.

2. DNA mutations are very heterogeneous and, besides a few recurrent mutations shared by many cancers, most blood-based mutation detection strategies require pre-existing knowledge of the mutations present in the primary tumor (i.e., they are not appropriate for screening for unknown cancers). In contrast, all tumor cells derived from specific organs express common lineage markers at the RNA level. Thus, a single cocktail of markers is used in the new methods for each individual type of cancer.

3. Low levels of CTCs are shed by invasive cancers before metastases are established (i.e., it is not too late for blood-based detection), but the presence of tumor cells in the blood connotes vascular invasion (i.e., invasive rather than indolent cancer). That is not the case for plasma DNA or plasma protein markers, which are leaked from dying cells in the primary tumor, and do not necessarily indicate vascular invasion. For example, serum PSA protein in the blood is shed by both benign prostate cells as well as primary prostate cancers. On the other hand, CTCs expressing PSA are shed only by invasive prostate cancers.

4. The analysis of RNA using the novel digital scoring technologies described herein is extraordinarily sensitive. However, free RNA is degraded in the bloodstream, and the use of isolation systems as described herein, such as microfluidic negative depletion systems (e.g., the CTC-Chip system) is unique in that the untagged tumor cells have high quality RNA which is extractable.

The choice of cDNA as a target molecule over DNA was made not only to boost the signal originating from each tumor cell, but also to specifically target only tumor cell transcripts to the exclusion of white blood cell (WBC) transcripts. The boost in signal is a significant advantage, as it avoids the need for the isolation of CTCs to very high levels of purity. That is, it enables robust and repeatable results with products that contain one or more “isolated” CTCs that are still surrounded by hundreds or thousands of contaminating WBCs, e.g., leukocytes, in the same product. Nevertheless, the strategy of targeting cDNA made from RNA as used in the new methods allows the new assay methods to be exquisitely tailored for maximum specificity with minimal levels of CTC purity compared to prior approaches.

The CTC-iChip technology is highly efficient at isolating non-hematopoietic cells by microfluidic depletion of antibody tagged leukocytes. This feature of the CTC-iChip provides intact tumor-derived RNA (at levels far above those obtained using other technologies), and it is independent of tumor cell surface epitopes (which are highly heterogeneous among cancers and among epithelial vs mesenchymal cell subtypes within an individual cancer). Furthermore, even pre-apoptotic cancer cells whose antibody staining and selection is suboptimal for imaging analysis can provide a source of tumor-specific RNA that can be scored using the methods described herein. For all these reasons, an isolation technology or system that provides high quality RNA from intact CTCs with at least some reduction in the WBCs found in the sample along with the rare CTCs, such as a microfluidic negative depletion system, e.g., the CTC-iChip, is an important first step isolation before the tumor-specific digital readout is applied to the product.

The droplet-based digital detection of extremely rare molecules within a heterogeneous mixture was originally developed for PCR amplification of individual DNA molecules that are below detection levels when present within a heterogeneous mixture, but which are readily identified when sequestered within a lipid droplet before being subjected to PCR. The basic technology for droplet-based digital PCR (“Droplet Digital PCR (ddPCR)”) has been commercialized by RainDance and Bio-Rad, which provide equipment for lipid encapsulation of target molecules followed by PCR analysis. Important scientific advances that made this possible include work in the laboratory of David Weitz at Harvard and Bert Vogelstein at Johns Hopkins. For example, see U.S. Pat. Nos. 6,767,512; 7,074,367; 8,535,889; 8,841,071; 9,074,242; and U.S. Published Application No. 2014/0303005. See also U.S. Pat. No. 9,068,181.

However, droplet digital PCR itself is not biologically significant unless coupled to a biological source of material, which is key to the new methods described herein. For instance, detection of lineage-specific RNAs (the central focus of our detection strategy) does not distinguish between normal prostate epithelial cells and cancerous prostate cells. As such, detection of prostate-derived transcripts in the blood is not meaningful by itself: they are present within debris from normal prostate cells or exosomes. It is only when coupled with the isolation of whole CTCs (i.e., intact CTCs in the blood) that the ddPCR assay achieves both extraordinary sensitivity and specificity. Hence, these two technologies are ideally suited for each other, because the isolation systems provide high quality RNA, and the droplet-based digital PCR assays are focused on RNA markers in the new methods.

One additional aspect is important to the overall success of the new assay methods. As noted, the new assay methods described herein use cDNA made from total RNA, but key to this use is the identification of appropriate biomarkers that are tumor lineage-specific for each type of cancer, yet are so unique as to be completely absent in normal blood cells (even with ddPCR sensitivity). The selection, testing, and validation of the multiple target RNA biomarkers for each type of cancer described herein enable the success of the new assay methods.

Assay Method Steps

The new assay methods start with the isolation of partially pure CTCs using an isolation system, such as a microfluidic negative depletion system, and proceed up to and including the analysis of data from a droplet digital PCR instrument. There are nine main assay steps, some of which are optional, though they generally provide better results:

1. isolating from the blood sample a product including CTCs and other cells present in blood; e.g. from a patient or a subject;

2. reducing a volume of the rare cell-containing product (optional);

3. isolating ribonucleic acid (RNA) molecules from the product, e.g., by cell lysis, and generating cDNA molecules in solution from the isolated RNA, e.g., by RT-PCR of RNA released from cells contained in the product;

4. cleanup of cDNA synthesized during the RT-PCR step (optional);

5. pre-amplifying the cDNA using gene-specific targeted preamplification probes, e.g., using the Fluidigm BioMark™ Nested PCR approach, or non-specific whole-transcriptome amplification, e.g., using the Clontech SMARTer™ approach (optional);

6. encapsulating cDNA molecules in individual droplets, e.g., along with PCR reagents;

7. amplifying cDNA molecules within each of the droplets in the presence of reporter groups configured to bind specifically to cDNA from CTCs and not to cDNA from other cells, e.g., using PCR;

8. detecting droplets that contain the reporter groups (e.g., “positive” droplets) as an indicator of the presence of cDNA molecules from CTCs in the droplets; and

9. analyzing CTCs in the detected droplets, e.g., to determine the presence of a particular disease in a patient or subject.

As described in further detail below, one of the important features of the new d-CTC assay methods is the careful selection of a number of target gene biomarkers (and corresponding primers) that deliver excellent sensitivity, while simultaneously maintaining nearly perfect specificity. A unique list of target gene biomarkers described herein (Table 1, below) was determined using bioinformatics analyses of publicly available datasets and proprietary RNAseq CTC data. Great care was taken to select markers that are not expressed in any subpopulations of leukocytes, but are expressed at a high enough frequency and intensity in CTCs to provide a reliable signal in a reasonably wide array of different and distinct patients. A specific set of markers was selected for each cancer type (e.g., prostate cancer, breast cancer, melanoma, lung cancer, and pancreatic cancer, among others).

The separate steps of the assay methods will now be described in more detail.

1. CTC Isolation

Patient blood is run through the CTC-iChip, e.g., version 1.3M or 1.4.5 T and a sample is collected in a 15 mL conical tube on ice. CTC-iChips were designed and fabricated as previously described (Ozkumur et al., “Inertial Focusing for Tumor Antigen-Dependent and -Independent Sorting of Rare Circulating Tumor Cells,” Science Translational Medicine, 5(179):179ra47 (DOI: 10.1126/scitranslmed.3005616) (2013)).

The blood samples (~20 mL per cancer patient) are collected in EDTA tubes using approved protocols. These samples are then incubated with biotinylated antibodies against CD45 (R&D Systems) and CD66b (AbD Serotec, biotinylated in house), followed by incubation with Dynabeads® MyOne® Streptavidin T1 (Invitrogen) to achieve magnetic labeling of white blood cells (Ozkumur et al., 2013).

The sample is then processed through the CTC-iChip, which separates the blood components (red and white blood cells and platelets) as well as unconjugated beads away from the CTCs. The CTCs are collected in solution while the red blood cells, platelets, unconjugated beads and the tagged white blood cells are collected in a waste chamber. The process is automated and 10 ml of blood is processed in 1 hour.

2. Volume Reduction and Storage of the Rare Cell-Containing Product

To fully lyse all cells isolated in the product, it is preferable to reduce the product volume from a typical starting point of several milliliters to a final volume of about 100 μl. This can be achieved, for example, by centrifuging the product and resuspending it in pluronic buffer in preparation for cell lysis and generation of cDNA. At this point, samples can be processed for long-term storage by adding RNAlater™ (ThermoFisher), followed by flash-freezing in liquid nitrogen and storage at −80° C.

3. Isolating RNA and Generation of cDNA from Cells in the Product

The RNA isolation step is important to the process to fully release all RNA molecules from cells in preparation for RT-PCR. A one-step, in-tube reaction can be used to minimize the risk of cell and RNA loss likely to be incurred during standard transfer steps. For example, one can use the Invitrogen SuperScript III® First-Strand Synthesis Supermix® for qRT-PCR kit, by adding the RT-PCR mastermix directly to the pelleted product, pipetting to lyse fully, and performing the reaction according to the kit protocol targeting a 1:1 RNA:cDNA ratio. Once cDNA has been synthesized, RNase H is applied to the reaction to remove any remaining RNA. Alternatively, if one wants to perform whole transcriptome pre-amplification of the sample in a later step, cDNA can be synthesized using the SMARTer™ Ultra Low Input RNA Kit protocol, which uses proprietary oligonucleotides and reverse transcriptase enzyme.

4. Cleanup of cDNA Synthesized During RT-PCR

Another useful, yet optional, step in the process involves the removal of lysis reagents from the cDNA-containing solution. The presence of harsh detergents can lead to the destabilization of the droplets used in the ddPCR method, once the cDNA-containing solution is transferred to the ddPCR instrument. Detergent removal can be accomplished, e.g., through the use of Solid Phase Reversible Immobilization (SPRI). This technique uses coated magnetic beads to first bind cDNA of a specific size range, then allows removal of detergent-containing supernatant, and finally elution of pure cDNA for input into the ddPCR instrument. In addition to the cleanup of the RT-PCR, the SPRI process also accomplishes a size selection of cDNA, which reduces the number of non-target cDNA molecules that enter the ddPCR phase of the process, which in turn reduces background and noise.

5. Pre-Amplification

Pre-amplification of the cDNA is an optional step that increases the number of template molecules that can be detected in the droplet PCR step, thus improving signal-to-noise ratio and boosting the confidence in a positive read-out. It can be a very powerful approach for the detection of markers that are expressed at low levels in CTCs, and for analyzing samples that contain very small numbers of possibly apoptotic CTCs, such as in the context of early detection of pre-metastatic disease. Two approaches have been modified to be applied in the workflow of the d-CTC assay. Specific Targeted Amplification (STA), based on the Fluidigm BioMark™ Nested PCR protocol, relies on the use of primers specifically designed to amplify the region targeted by the probes used in the droplet PCR step (see Table 2). These primers were carefully designed and tested in conjunction with their respective fluorescent probes to ensure efficient and specific amplification without an increase in noise in healthy controls. Alternatively, whole transcriptome amplification, based on the SMARTer™ Ultra Low Input RNA Kit protocol, relies on the amplification of every transcript in the product, including both those found in WBCs and those found in CTCs, using random primers.

6. Encapsulation of cDNA Plus PCR Reagents in Droplets

Once cDNA has been synthesized and purified of contaminating detergents, the entire aggregate of cDNA molecules in solution plus qPCR reagents is divided into many tiny compartmentalized reactions, for example, by a droplet making instrument, e.g., a droplet generator such as the Bio-Rad Automated Droplet Generator, which generates 20,000 droplets per sample. Each reaction consists of an extremely small droplet of non-aqueous fluid, e.g., oil (PCR stable, e.g., proprietary formulation from vendor), which contains Taqman-type PCR reagents with gene-specific primers and an oligonucleotide probe, and a small amount of sample. Once droplet generation is complete, the sample consists of an emulsion containing a vast number of individual PCR-ready reactions.

For this step, one can use the PCR probes and related primers for any one or two or more different target genes listed in Table 1 below for overall determination of tumor load, e.g., to determine tumor progression or response to therapy, in single or multiplex reactions. Thus, although in some cases a single set of PCR primers and probes for a particular gene from Table 1 can be included in each droplet, it is also possible to multiplex PCR primers and probes for two or more different genes in each droplet using different fluorescent probes for each primer/probe set, to maximize the detection of tumor cells, given the heterogeneity of gene expression in CTCs. It is also possible to multiplex PCR primers and probes for multiple genes targeting different cancer types in each droplet, thus enabling the broad yet specific detection of multiple tumor types in a single assay.

7. PCR of Droplet Encapsulated cDNA Molecules

Standard PCR cycling is performed on the entire emulsion sample using qPCR cycling conditions. The reaction is carried to 45 cycles to ensure that the vast majority of individual droplet-PCR volumes are brought to endpoint. This is important because, although the reaction is performed with Taqman-type qPCR reagents and cycled under qPCR conditions, the fluorescent intensity of the sample will not be measured during the PCR cycling, but rather in the next step.

8. Detection of Positive Droplets

Since each individual partitioned PCR is brought fully to endpoint before any measurement of fluorescence is performed, each individual droplet will be either a fully fluorescent droplet or will contain virtually no fluorescence at all. This enables the simple enumeration of all positive (fluorescent) and negative (non-fluorescent) droplets.
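For illustration only, the enumeration of positive and negative droplets described above can be sketched as a simple threshold on the endpoint fluorescence amplitude of each droplet. The function name, the amplitude values, and the threshold are hypothetical and are not taken from any instrument software; in practice the threshold would be set from control samples.

```python
from typing import Sequence, Tuple


def count_droplets(amplitudes: Sequence[float], threshold: float) -> Tuple[int, int]:
    """Return (positive, negative) droplet counts for one fluorescence channel.

    A droplet is called positive when its endpoint amplitude is at or above
    the threshold. The threshold is an assumed, user-supplied value.
    """
    positive = sum(1 for a in amplitudes if a >= threshold)
    negative = len(amplitudes) - positive
    return positive, negative


# Hypothetical endpoint amplitudes (arbitrary units) for five droplets.
example = [100.0, 120.0, 9000.0, 95.0, 8800.0]
positives, negatives = count_droplets(example, threshold=4000.0)
```

Because each reaction is driven to endpoint, the two populations are expected to be well separated, which is what makes a single fixed threshold a reasonable first approximation.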

9. Analysis

Because the upstream RT-PCR targeted a 1:1 RNA:cDNA ratio, each positive droplet should represent a single originating RNA transcript. This interpretation depends on the number of individual droplets far exceeding the number of target cDNA molecules. In the new process, at one extreme we consider the possibility of a single CTC being isolated and lysed, releasing some number of RNA transcripts which are then reverse-transcribed 1:1 into cDNA, partitioned, PCR-amplified, and enumerated.

We estimate that in the case of a moderately expressed gene, such as the KLK3 gene in prostate cancer cells, each cell contains approximately 80-120 copies of KLK3 mRNA. The Bio-Rad QX200 ddPCR System generates 20,000 droplets, which ensures that for small numbers of isolated CTCs and moderately-expressed target genes it is very unlikely that any droplet will contain more than one target cDNA molecule. On the other hand, in cases where the numbers of CTCs reach dozens or hundreds, for moderately-expressing genes there will likely be droplets that contain multiple copies of target cDNA. In such cases, approximate numbers of originating transcripts can be estimated using Poisson statistics.
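As a minimal sketch of the Poisson statistics referred to above, and assuming that target molecules are distributed randomly and independently among droplets, the mean number of copies per droplet can be estimated from the fraction of positive droplets as λ = −ln(1 − p), where p is that fraction. The function names below are hypothetical and are provided only to make the arithmetic concrete.

```python
import math


def poisson_copies(positive: int, total: int) -> float:
    """Estimate the number of target molecules loaded into `total` droplets.

    Uses lambda = -ln(1 - positive/total) copies per droplet (Poisson model),
    then scales by the number of droplets.
    """
    if total <= 0:
        raise ValueError("total must be a positive droplet count")
    if not 0 <= positive < total:
        # With every droplet positive the estimate is unbounded.
        raise ValueError("positive must satisfy 0 <= positive < total")
    lam = -math.log(1.0 - positive / total)
    return lam * total


def expected_multi_occupancy(copies: float, total: int) -> float:
    """Expected number of droplets holding two or more target molecules."""
    lam = copies / total
    p_multi = 1.0 - math.exp(-lam) * (1.0 + lam)
    return p_multi * total


# One CTC with roughly 100 transcripts partitioned into 20,000 droplets.
few = expected_multi_occupancy(100, 20000)
# One hundred such CTCs (roughly 10,000 transcripts) in the same droplets.
many = expected_multi_occupancy(10000, 20000)
```

Under these assumptions, about 100 copies in 20,000 droplets gives an expected count of roughly 0.25 droplets with more than one copy, so the positive droplet count is essentially the transcript count. At about 10,000 copies, well over a thousand droplets are expected to hold multiple copies, and the Poisson correction becomes necessary.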

Novel Gene Panels to Enable Lineage-Specific Identification of CTCs

As discussed above, the identification of gene transcripts that are highly specific for cancer cells within the context of surrounding normal blood cells is central to the new methods. While many genes are known to be more highly expressed in cancer cells, the vast majority of these genes also typically have at least limited expression in normal tissues, including blood. Given the extraordinary sensitivity required for this assay, complete absence of signal in normal blood cells is essential for high confidence identification of tumor cells in the bloodstream.

Candidate tumor-specific transcripts used to detect CTCs in blood are first selected by analyzing publicly available gene expression data sets derived from breast, prostate, lung, pancreas, and liver cancers and melanoma, as well as our lab-generated single cell RNASeq data from CTCs isolated from breast, prostate and pancreatic cancer patients and mouse models of these cancers. Transcripts whose expression is restricted to tumors and absent or undetectable in blood components are chosen for further downstream analysis. Demonstrating and validating total absence of expression (with the highest level of sensitivity, i.e., Digital PCR assays) in normal blood cells is important. In general, we found that only ~10% of candidate genes predicted based on computational models or RNA Seq data are truly negative in human blood samples.

In particular, candidate tumor-specific mRNA transcripts for the detection of CTCs were initially identified through the analysis of gene expression data sets (microarray and RNA-Seq) derived previously for human breast, prostate, lung, pancreas, hepatocellular, and melanoma cancers. Specific publicly available data sets used for this analysis include The Cancer Genome Atlas (TCGA) (The Cancer Genome Atlas, available online at tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp) and the Cancer Cell Line Encyclopedia (CCLE) (available online at broadinstitute.org/ccle/home; see also, Barretina et al., The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity, Nature 483:603-607 (2012)). In addition, single-cell RNA-seq gene expression data from CTCs isolated from human patients with breast, prostate, and pancreatic cancers were analyzed (GEO accession numbers GSE51827, GSE60407, and GSE67980) (Aceto et al., Circulating tumor cell clusters are oligoclonal precursors of breast cancer metastasis, Cell, 158:1110-1122 (2014); Ting et al., Single-Cell RNA Sequencing Identifies Extracellular Matrix Gene Expression by Pancreatic Circulating Tumor Cells, Cell Rep, 8:1905-1918 (2014); and Miyamoto et al., RNA-Seq of single prostate CTCs implicates noncanonical Wnt signaling in antiandrogen resistance, Science 349:1351-1356 (2015)). Tumor-specific transcripts identified through these databases were then compared to human leukocyte RNA-Seq gene expression data (GEO accession numbers GSE30811, GSE24759, GSE51808, GSE48060, GSE54514, and GSE67980). Transcripts that displayed significant differential expression, with high expression in tumors and low or undetectable expression in leukocytes, were then selected for further downstream analysis. Moreover, a literature search was performed to select additional candidate tumor-specific transcripts. Between 50 and 100 candidate genes were selected for each type of human cancer.
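By way of a non-limiting sketch, the computational pre-selection described above (high expression in tumors, low or undetectable expression in leukocytes) can be expressed as a two-threshold filter. The function name, the cutoff values, and the expression numbers below are hypothetical placeholders; the source does not specify the statistical test or thresholds that were actually used.

```python
from typing import Dict, List


def select_candidates(
    tumor_expr: Dict[str, float],
    leukocyte_expr: Dict[str, float],
    min_tumor: float = 10.0,      # assumed cutoff, arbitrary expression units
    max_leukocyte: float = 0.1,   # assumed cutoff, arbitrary expression units
) -> List[str]:
    """Return genes high in tumor data and low or undetectable in leukocytes.

    Genes lacking leukocyte data are skipped, because absence of expression
    in blood cells has to be demonstrated rather than assumed.
    """
    selected = []
    for gene, tumor_level in tumor_expr.items():
        if gene not in leukocyte_expr:
            continue
        if tumor_level >= min_tumor and leukocyte_expr[gene] <= max_leukocyte:
            selected.append(gene)
    return sorted(selected)


# Toy numbers for illustration only; they are not measured values.
tumor = {"KLK3": 250.0, "PTPRC": 40.0, "FOLH1": 80.0, "GENE_X": 2.0}
leukocytes = {"KLK3": 0.0, "PTPRC": 300.0, "GENE_X": 0.0}
candidates = select_candidates(tumor, leukocytes)
```

In this toy input only one gene passes both cutoffs; a gene with no leukocyte measurement is deliberately left out until such data exist. Any candidate that passes such a filter would still require the experimental validation described herein.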

For each candidate gene within each specific cancer type, two to four sets of PCR primers were designed to span regions across the target transcript. Primers are synthesized by IDT (Integrated DNA Technologies); probes are labeled with FAM or HEX, ZEN, and IABkFQ to create a probe targeting the middle of the amplicon. Unique features of our PCR primer design methodology necessary for the successful application of digital PCR-based mRNA transcript detection in human CTCs include the following: 1) the specific targeting of the 3′ end of each mRNA transcript, given the proclivity of cellular mRNA transcripts to degrade from the 5′-end, particularly in unfixed, fragile cells such as CTCs; 2) the design of primers to generate amplicons that span introns in order to exclude the unintentional amplification of contaminating genomic DNA, for example from excess contaminating leukocytes in the enriched CTC mixture; and 3) the design of primers to inclusively amplify multiple splice variants of a given gene, given the uncertainty in some cases regarding the clinical relevance of specific splice variants.
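As an illustrative sketch of design feature 2) above, an amplicon on the spliced transcript spans an intron when it crosses at least one exon-exon junction, so that the corresponding genomic DNA template would include the intron and would not be efficiently amplified. The function name and the exon lengths below are hypothetical; coordinates are assumed to be 0-based, half-open positions on the mature mRNA.

```python
from typing import Sequence


def spans_exon_junction(
    exon_lengths: Sequence[int], amplicon_start: int, amplicon_end: int
) -> bool:
    """Return True if the amplicon [start, end) crosses an exon-exon junction.

    `exon_lengths` lists exon sizes in transcript order; junction positions
    are the cumulative sums of all but the last exon.
    """
    junction = 0
    for length in exon_lengths[:-1]:
        junction += length
        # The junction lies between bases junction-1 and junction.
        if amplicon_start < junction < amplicon_end:
            return True
    return False


# Hypothetical three-exon transcript: junctions at positions 100 and 300.
exons = [100, 200, 150]
crosses = spans_exon_junction(exons, 250, 350)      # crosses junction at 300
within_exon = spans_exon_junction(exons, 120, 280)  # lies inside exon 2
```

A fuller design check would also consider the distance of the amplicon from the 3′ end of the transcript and whether the junction is shared by the splice variants of interest, which this sketch does not attempt.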

The specificity of the primers was first tested by qRT-PCR using cDNA derived from cancer cell lines (representing breast, prostate, lung, pancreas, and liver cancers and melanoma). For each type of human cancer, 2 to 5 established cancer cell lines were cultured and used for initial testing to evaluate PCR primer performance and assess for expression of the target transcript in the specified cancer. To provide an initial test of specificity, the same primers were used to evaluate expression of the target transcript in leukocytes from healthy individuals who do not have a diagnosis of cancer. Leukocytes from a minimum of five different healthy individuals were tested in this phase of testing (mixture of male and female individuals—this was dependent on the type of cancer; i.e. candidate prostate cancer and breast cancer genes required the use of male or female healthy donors only, respectively).

Leukocytes from healthy individuals were isolated from whole blood using Cell Preparation Tubes with Sodium Heparin (CPT) (Becton, Dickinson, and Co., NJ) following product insert instructions. RNA extraction and first-strand cDNA synthesis was performed for cancer cell lines and isolated leukocytes using standard methods. The specificity of expression of each gene (using 2 to 4 distinct sets of primers for each gene) was tested using qRT-PCR (cell line cDNA as positive controls, leukocyte cDNA from healthy donors as negative controls, and water as an additional negative control). Transcripts present in cancer cell lines, but absent in leukocytes based on qRT-PCR testing were then selected for further validation by droplet digital PCR. The selection criteria to pass this stage of testing were highly stringent, and required qRT-PCR signal to be present in at least one cancer cell line and absent in all healthy donor leukocyte samples tested.

Target transcripts and specific primer pairs that passed the qRT-PCR stage of testing were further validated using droplet digital PCR. For this stage of testing, the CTC-iChip (see, e.g., Ozkumur et al., “Inertial focusing for tumor antigen-dependent and -independent sorting of rare circulating tumor cells,” Sci Transl Med, 5, 179ra47 (2013)) was used to process whole blood samples donated by healthy individuals. The CTC-iChip performs negative depletion of red blood cells, platelets, and leukocytes from whole blood, and generates a sample product that is enriched for cells in the blood that do not express leukocyte markers, including CTCs (which should not be present in healthy individuals). For each blood sample, the product from the CTC-iChip was supplemented with an RNA stabilization solution (RNAlater®, Life Technologies) and processed for RNA extraction and cDNA synthesis using standard methods. Droplet digital PCR (Bio-Rad, CA) was then used to quantitate the number of transcripts present in each sample based on the specific primer pairs being tested. Samples assessed by droplet digital PCR during this phase of testing included cDNA from cancer cell lines, leukocyte cDNA from healthy donors processed through the CTC-iChip (at least four healthy individuals per primer pair being tested), and water as a negative control.

Criteria for passing droplet digital PCR testing were stringent, and included: 1) the presence of transcript signal in cancer cell lines (at least one cell line with >10 positive droplets); 2) excellent signal-to-noise ratio represented by separation of signal between positive and negative (empty) droplets; 3) minimal or absent droplet signal in healthy donors (<3 droplets per healthy donor); and 4) absent droplet signal in water (0 positive droplets).
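For illustration, the four pass criteria above can be written as a single check on the positive droplet counts obtained for one primer pair. The function and parameter names are hypothetical. Criterion 2) is qualitative in the text, so it is represented here as a flag supplied by the analyst, which is an assumption of this sketch; the requirement of at least four healthy donors is taken from the description of the testing samples.

```python
from typing import Sequence


def passes_ddpcr_criteria(
    cell_line_counts: Sequence[int],
    healthy_donor_counts: Sequence[int],
    water_count: int,
    good_separation: bool,
) -> bool:
    """Apply the droplet digital PCR pass criteria to one primer pair.

    Counts are numbers of positive droplets per sample.
    """
    # 1) at least one cancer cell line with more than 10 positive droplets
    has_signal = any(c > 10 for c in cell_line_counts)
    # 3) fewer than 3 positive droplets in every healthy donor,
    #    with at least four healthy donors tested
    enough_donors = len(healthy_donor_counts) >= 4
    healthy_clean = all(c < 3 for c in healthy_donor_counts)
    # 4) no positive droplets in the water control
    water_clean = water_count == 0
    # 2) separation of positive and negative droplets, judged by the analyst
    return (
        has_signal
        and good_separation
        and enough_donors
        and healthy_clean
        and water_clean
    )


# Hypothetical droplet counts for two primer pairs.
pair_a = passes_ddpcr_criteria([0, 25], [0, 1, 2, 0], 0, True)
pair_b = passes_ddpcr_criteria([0, 25], [0, 3, 0, 0], 0, True)
```

In this example the first primer pair passes, while the second fails because one healthy donor shows three positive droplets.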

Primers that amplified transcripts specifically in cell lines and not in leukocytes in the above droplet digital PCR testing were then subjected to detailed testing of sensitivity of signal. Using single cell micromanipulation, precise numbers of cancer cells (1, 5, 10, 25, and 50 cells) were spiked into whole blood donated by healthy individuals, and then processed through the CTC-iChip. Each sample was then processed as above for testing with droplet digital PCR, and evaluated for sensitivity to ensure the signal was sufficient for the desired clinical application.

The above stringent procedure of evaluating candidate genes and primers using qRT-PCR and droplet digital PCR resulted in a final primer list consisting of approximately 10% of the initial list of 50-100 candidate genes for each type of cancer (total of approximately 400 initial candidate genes). These primers are then further evaluated for signal in patient CTCs using blood samples donated by cancer patients undergoing cancer treatment at the MGH Cancer Center, collected under an IRB-approved clinical protocol. Key to this portion of the evaluation is a comparison with blood collected from healthy individuals without a diagnosis of cancer. The following Table 1 lists the primers and probes that have been developed thus far using these methods for the specific detection of CTCs from patients with prostate, breast, hepatocellular, pancreatic, lung, and melanoma cancers using droplet digital PCR.

While a single gene for each cancer type could be used, the presence of multiple genes within each panel is useful both for sensitivity (CTCs are heterogeneous even within individual patients in their expression patterns) and specificity (detection of multiple gene signals confers added confidence that this represents a true cancer cell signature).

The gene list provided below in Table 1 includes transcripts that are unique to specific types of cancer (e.g., highly specific markers of prostate or breast or liver cancers), as well as genes that are shared by several cancer types, e.g., all epithelial cancer types (and thus may serve as pan-cancer markers), and genes that are induced in certain conditions (e.g., active androgen signaling in prostate cancer or active estrogen signaling in breast cancer). Thus, each type of cancer was assigned a specific panel of genes that is designed for optimal sensitivity, specificity, and clinically actionable information for the given cancer type.

In addition, primers described in Table 2 are designed to pre-amplify some of the genes listed in Table 1, while maintaining their high specificity. If STA is a method of choice, these nested primers become additional components of each cancer panel.

Gene Lists for Different Types of Cancers

The following Table 1 provides a list of names of genes (with (Genbank ID) and Sequence Identification numbers (SEQ ID NO)), along with cancer types for which they are selective (Br: breast, Lu: lung, Li: liver, Pr: prostate, Panc: pancreatic, Mel: melanoma). In addition, optimized primer sets are listed for each gene (primers 1 and 2), along with the composition of the fluorescent primer probes (e.g., 6-FAM™ (blue fluorescent label) or HEX™ (green fluorescent label) for tagged probes, and ZEN-31ABkFQ quencher) for optimal visualization of the digital PCR product.

TABLE 1 Disease Seq Seq Seq Gene Group ID Primer 2 ID Primer 1 ID Probe AGR2 Br, Lu, Li, 1 CTG ACA GTT AGA 2 CAA TTC AGT CTT 3 /56-FAM/ATG CTT ACG /ZEN/AAC CTG Pr GCC GAT ATC AC CAG CAA CTT GAG CAG ATA CAG CTC /31ABkFQ/ ALDH1A3 Br, Lu, 4 GGT GGC TTT AAA 5 TGT CGC CAA GTT 6 /56-FAM/TTT TCA CTT /ZEN/CTG TGT (220) Panc ATG TCA GGA A TGA TGG T ATT CGG CCA AAG C/E1ABkFQ/ CADP52 Br, Li, Lu, 7 CTC TGC ATT TTT 8 GCC TTG CAC TTC 9 /56-FAM/TCC GAC GTG /ZEN/GTA CTG (93664) Mel GGACAT AGG AG CAT TAT GAC TCA TTC ACC T/31ABkFQ/ CDH11 Br, Lu, 10 GAG GCC TAC ATT 11 GTG GTT CTT TCT 12 /56-FAM/CAT CCT CGC /ZEN/CTG CAT (1009) Panc CTG AAC GC TTT GCC TTC TC CGT CAT TCT /31ABkFQ/ CDH3 Br, Li, Mel 13 GTT TCA TCC TCC 14 GCT CCT TGA TCT 15 /56-FAM/CTG CTG GTG /ZEN/CTG CTT (1001) CTG TGC TG TCC GCT TC TTG TTG GT/31ABkFQ/ COL8A1 Br, Lu 16 GAT GCC CCA CTT 17 CCT CGT AAA CTG 18 /56-FAM/AGT ATC CAC /ZEN/ACC TAC (1295) GCA GTA GCT AAT GGT CCC AAT ATA TGA AGG AAA /31ABkFQ/ EGFR Br, Lu, Li, 19 CTG CTG CCA CAA 20 TTC ACA TCC ATC 21 /56-FAM/CTG CCT GGT /ZEN/CTG CCG (1956) Panc CCA GT TGG TAC GTG CAA ATT C/31ABkFQ/ FAT1 Br, Lu, Li, 22 GAT CCT TAT GCC 23 ATC AGC AGA GTC 24 /56-FAM/TCT TGT CAG /ZEN/CAG CGT (2195) Mel, Pr, ATC ACC GT AAT CAG TGA G TCC CGG /31ABkFQ/ Panc FAT2 Br, Lu 25 CCT GGA TGC TGA 26 TCC TCC ACT CAT 27 /56-FAM/ACC TGC TAC /ZEN/ATC ACA (2196) CAT TTC TGA CTC CAA CT GAG GGA GAC C/31ABkFQ/ FOLH1 Pr 28 CAA TGT GAT AGG 29 TGT TCC AAA GCT 30 /56-FAM/ATG AAC AAC /ZEN/AGC TGC (2346) TAC TCT CAG AGG CCT CAC AA TC ACT CTG A/31ABkFQ/ HOXB13 Br, Lu, Pr 31 CAG CCA GAT GTG 32 CTG TAC GGA ATG 33 /56-FAM/CAG CAT TTG /ZEN/CAG ACT (261729) TTG CCA CGT TTC TTG CCA GCG G/31ABkFQ/ KLK2 Pr 34 GCT GTG TAC AGT 35 GTC TTC AGG CTC 36 /56-FAM/TGG CTA TTC /ZEN/TTC TTT (2917) CAT GGA TGG AAA CAG GT AGG CAA TGG GCA /31ABkFQ/ KLK3 Pr 37 GTG TGC TGG ACG 38 GTG ATA CCT TGA 39 /56-FAM/AAA GCA CCT /ZEN/GCT CGG (354) CTG GA AGC ACA CCA TTA GTG ATT CT/31ABkFQ/ C LSAMP Mel 40 CAC ATT TGA GTG 41 GCG GAT GTC AAA 42 
/56-FAM/TCC AAG AGC /ZEN/AAT GAA (4045) AAG CTT GTC G CAA GTC AAG GCC ACC ACA /31ABkFQ/ MAGEA6-RM1 Mel 43 GAA GGA GAA GAT 44 GCT GAC TCC TCT 45 /56-FAM/TTG CCC TGA /ZEN/CCA GAG (4105) CTG CCA GTG GCT CAA G TCA TCA TGC /31ABkFQ/ MET Br, Li, Lu, 46 CCA GTA GCC TGA 47 TGT CAG TGA TTC 48 /56-FAM/AGT CAT AGG /ZEN/AAG AGG (4233) Panc TTG TGC AT TGT TCA AGG A GCA TTT TGG TTG T/31ABkFQ/ MLANA Mel 49 ACT CTT ACA CCA 50 CCA TCA AGG CTC 51 /56-FAM/AAG ACT CCC /ZEN/AGG ATC (2315) CGG CTG A TGTG ATC CAT ACT GTC AGG A/31ABkFQ/ NPV1R Br, Lu 52 GGA TCT GAG CAG 53 GAA TTC TTC ATT 54 /56-FAM/AGC AGG AGC /ZEN/GAA AAA (4886) GAG AAA TAC C CCC TTG AAC TGA GAC AAA TTC CAA AG/31ABkFQ/ OCLN Br, Lu, Li 55 AAG ATG GAC AGG 56 ACT CTT TCC ACA 57 /56-FAM/TGC AGA CAC /ZEN/ATT TTT (100506658) TAT GAC AAG TC TAG TCA GAT GG AAC CCA CTC CTC G/31ABkFQ/ PDZRN3 Mel 58 TGT CCT GGC TGT 59 TGG ATC CCT ATC 60 /56-FAM/AGC TCC TCC /ZEN/CTG TCC (23024) TCA TTC TG TCT TGC CA ATC TCC T/31ABkFQ/ PGR Br 61 GGC AAT TGG TTT 62 GGA CTG GAT AAA 63 /56-FAM/ACA AGA TCA /ZEN/TGC AAG (5241) GAG GCA A TGT ATT CAA GCA TTA TCA AGA AGT TTT GTA AGT T/ 31ABkFQ/ PKP3 Br, Li, Lu, 64 CTG GTG GAG GAG 65 GGT CTC TGG ATG 66 /56-FAM/AGT GTC CGC /ZEN/AGC AGC (11187) Panc AAC GG AAA GGT T TCG AA/31ABkFQ/ PMEL Mel 67 CAG GCA TCG TCA 68 ACA CAA TGG ATC 69 /56-FAM/TTT GGC TGT /ZEN/GAT AGG (6490) GTT TCC T TGG TGC TAA TGC TTT GCT G/31ABkFQ/ PPL Br, Lu, Li 70 GAG GAG AGA ATC 71 AGG TTC AGG TAC 72 /56-FAM/AGG AAC TCC /ZEN/ATT GAG (5493) AAC AAA CTG C TCC TTC CAG GCG CAC AT/31ABkFQ/ RXRG Mel 73 ATA CTT CTG CTT 74 AGC CAT TGT ACT 75 /56-FAM/CTC TGA GGT /ZEN/GGA GAC (6258) GGT GTA GGC CTT TAA CCC A TCT GCG AGA /31ABkFQ/ RND3 Br, Lu, Li, 76 CCG AGA ATT ACG 77 GCG GAC ATT GTC 78 /56-FAM/ACG GCC AGT /ZEN/TTT GAA (350) Mel, Panc TTC CTA CAG TG ATA GTA AGG A ATC GAC ACA C/31ABkFQ/ S100A2 Br, Lu, Li, 79 CTG CCT TGC TCT 80 CTT ACT CAG CTT 81 /56-FAM/ACC TGG TCT /ZEN/GCC ACA (6273) Panc CCT TCC GAA CTT GTC G GAT CCA TG/31ABkFQ/ SCGBZA1 Br 82 ACT TCC 
TTG ATC 83 GTC TTT TCA ACC 84 /56-FAM/CCA TGA AGC /ZEN/TGC TGA (4246) CCT GCC A ATG TCC TCC A TGG TCC TCA /31ABkFQ/ SFRP1 Mel 85 CAA TGC CAC CGA 86 CTT TTA TTT TCA 87 /56-FAM/TGT GAC AAC /ZEN/GAG TTG (6422) AGC CT TCC TCA GTG CAA AAA TCT GAG GCC /31ABkFQ/ AC SOX10 Mel 88 CTT GTC ACT TTC 89 CTT CAT GGT GTG 90 /56-FAM/TTG TGC AGG /ZEN/TGC GGG (6663) GTT CAG CAG GGC TCA TAC TGG/31ABkFQ/ SCHLAP1/ Pr 91 TCC TTG GAT GAC 92 AGA TAC CAC CTC 93 /56-FAM/CCA ATG ATG /ZEN/AGG SET4 TCT CCC TAC CCT GAA GAA AGC GGG ATG GAG /31ABkFQ/ (101669767) SCHLAP1 Pr 94 AGA GGT TTA ATG 95 CTC TGG TCT GTC 96 /56-FAM/ACA TGC CTT /ZEN/TCA SET 5 GGC TCA CAG GTC ATG TAA G CCT TCT CCA CCA /31ABkFQ/ AMACR Pr 97 CAC ACC ACC ATA 97 TCA CTT GAG GCC 99 /56-FAM/AGA AAC GGA /ZEN/GGT (23600) CCT GGA TAA T AAG AGT TC CCA GCC AAG TTC /31ABkFQ/ AR Pr 100 CTT TCT TCA GGG 101 CTT GTC GTC TTC 102 /56-FAM/AAG CAG GGA /ZEN/TGA Variant 7/ TCT GGT CAT T GGA AAT GTT ATG CTC TGG GAG AAA /31ABkFQ/ SET1 (367) AR Pr 103 GAG GCA AGT CAG 104 TGT CCA TCT TGT 105 /56-FAM/TGA AGC AGG /ZEN/GAT Variant 7 CCC TTC T CGT CTT CG GAC TCT GGG AGA /31ABkFQ/ SET 3 AR Pr 106 GCT CAC CAT GTG 107 TGG GAG AGA GAC 108 /56-FAM/TGA TTG CGA /ZEN/GAG Variant 12 TGA CTT GA AGC TTG TA AGC TGC ATC AGT /31ABkFQ/ SET 1 AR Pr 109 GAA AGT CCA CGC 110 GCA GCC TTG CTC 111 /56-FAM/TGA TTG CGA /ZEN/GAG Variant 12 TCA CCA T TCT AGC AGC TGC ATC AGT /31ABkFQ/ SET 4 UGT2B15 Pr 112 CTC TGC ACA AAC 113 TTT CCT CGC CCA 114 /56-FAM/TTG GCT GGT /ZEN/TTA SET 1 TCT TCC ATT TC TTC TTA CC CAG TGA AGT CCT CC/31ABkFQ/ (7366) UGT2B15 PR 115 GGA AGG AGG 116 GTG AGC TAC TGG 117 /56-FAM/TGG CTA CAC /ZEN/ATT SET 5 GAA CAG AAA TCC CTG AAC TAT T TGA GAA GAA TGG TGG A/31ABkFQ/ AFP Li 118 AGG AGA TGT GCT 119 TCT GCA TGA ATT 120 /56-FAM/AAT GCT GCA /ZEN/AAC SET 1 GGA TTG TC ATA CAT TGA CCA TGA CCA CGC TG/31ABkFQ/ (174) C AFP  Li 121 ACT GCA GAG ATA 122 TCA CCA TTT TGC 123 /56-FAM/TTG CCC AGT /ZEN/TTG SET 2 AGT TTA GCT GAC TTA CTT CCT TG TTC AAG AAG CCA C/31ABkFQ/ STEAP2 Br, Lu, 
124 CAT GTT GCC TAC 125 TCT CCA AAC TTC 126 /56-FAM/ACA TGG CTT /ZEN/ATC AGC (261729) Pr, Panc AGC CTC T TTC CTC ATT CC AGG TTC ATG CA/31ABkFQ/ TEAD3 Br, Lu, Li 127 GAA GAT CAT CCT 128 CTT CCG AGC TAG 129 /56-FAM/AGC GTG CAA /ZEN/TCA ACT (7005) GTC AGA CGA G AAC CTG TAT G CAT TTC GGC /31ABkFQ/ TFAP2C Br, Lu, 130 GAT CAG ACA GTC 131 GAC AAT CTT CCA 132 /56-FAM/ACA GGG GAG /ZEN/GTT CAG (7022) Mel ATT CGC AAA G GGG ACT GAG AGG GTT CTT /31ABkFQ/ TMPR5S2 Pr 133 CCC AAC CCA GGC 134 TCA ATG AGA AGC 135 /56-FAM/ACC CGG AAA /ZEN/TCC AGC (7113) ATG ATG ACC TTG GC AGA GCT /31ABkFQ/ GPC3 Li 136 TGC TGG AAT GGA 137 GCT CAT GGA GAT 138 /56-FAM/TCC TTG CTG /ZEN/CCT TTT (2719) CAA GAA CTC TGA ACT GGT GGC TGT ATC T/31ABkFQ/ ALB Li 139 CTT ACT GGC GTT 140 CCA ACT CTT GTA 141 /56-FAM/ACA TTT GCT /ZEN/GCC CAC (219) TTC TCA TGC GAG GTC TCA AG TTT TCC TAG GT/31ABkFQ/ G6PC Li 142 GGA CCA GGG AAA 143 GCA AGG TAG ATT 144 /56-FAM/ACA GCC CAG /ZEN/AAT CCC SET 1 GAT AAA GCC CGT GAC AGA AAC CAC AAA/31ABkFQ/ (2538) G6PC Li 145 CAT TTT GTG GTT 146 GAT GCT GTG GAT 147 /56-FAM/CTG TCA GCA /ZEN/ATC TAC SET 2 GGG ATT CTG G GTG GCT CTT GCT GCT CA/31ABkFQ/ PRAME Mel 148 GCC TTG CAC TTC 149 CTC TGC ATT TTT 150 /56-FAM/CAA GCG TTG /ZEN/GAG GTC (23532) CAT TAT GAC GGA CAT AGG AG CTG AGG C/31ABkFQ/ AHSG Li 151 ATG TGG AGT TTA 152 AGC TTC TCA CTG 153 /56-FAM/CCA CAG AGG /ZEN/CAN CCA (197) CAG TGT CTG G AGT GTT GC AGT GTA ACC /31ABkFQ/ GPR143 Mel 154 ACG GCT CCC ATC 155 CCA CTA TGT CAC 156 /56-FAM/TTC GCC ACG /ZEN/AGA ACC (4935) CTC ACT CAT GTA CCT G AGC AGC /31ABkFQ/ PTPRZ1 Mel 157 TGC TCT GAC AAC 158 GGC TGA GGA TCA 159 /56-FAM/AGG CCA GGA /ZEN/GTC TTT (5803) CCT TAT GC CTT TGT AGA GCT GAC ATT /31ABkFQ/ MUCL1 Br 160 CAT CAG CAG GAC 161 TGT CTG TGC TCC 162 /56-FAM/ACT CCC AAG /ZEN/AGT ACC (118430) CAG TAG C CTG ATC T AGG ACT GCT /31ABkFQ/ PIP Br 163 TCA TTT GGA CGT 164 CTT GCT CCA GCT 165 /56-FAM/CCT GCT CCT /ZEN/GGT TCT (5304) ACT GAC TTG G CCT GTT C CTG CCT G/31ABkFQ/ PGR Br 166 GGT GTT TGG TCT 167 ACT 
GGG TTT GAC 168 /56-FAM/AGT GGG CAG /ZEN/ATG CTG (5241) AGG ATG GAG TTC GTA GC TAT TTT GCA C/31ABkFQ/ TFAP2C Br, Lu 169 GTG ACT CTC CTG 170 CCA TCT CAT TTC 171 /56-FAM/TTC GGC TTC /ZEN/ACA GAC (7022) ACA TCC TTA G GTC CTC CAA ATA GGC AAA GT/31ABkFQ/ SCGB2A1 Br 172 ACT CTG AAA AAC 173 TCT AGC AAT CAA 174 /56-FAM/TAG CCC TCT /ZEN/GAG CCA (4246) TTT GGA CTG ATG CAG ATG AGT TCT AAC GCC /31ABkFQ/ FAT1 Br, Lu, Pr 175 AGC TCC TTC CAG 176 GTC TGC TCA TCA 177 /56-FAM/ATC CCA GTG /ZEN/ATA CCC (2195) TCC GAA T ATC ACC TCA ATT GTC ATC GC/31ABkFQ/ FAT2 Br, Lu, Pr 178 GGA CAG AGA GAA 179 TGT GGG AGA ATA 180 /56-FAM/TGG AGG TGA /ZEN/CTG TGC (2196) CAA GGA TGA AC TAG GTG GAT TG TGG ACA ATG /31ABkFQ/ RND3 Br, Lu 181 GCT TTG ACA TCA 182 CTG TCC GCA GAT 183 /56-FAM/ACA GTG TCC /ZEN/TCA AAA (390) GTA GAC CAG AG CAG ACT TG AGT GGA AAG GTG A/31ABkFQ/ SFTP8 Lu 184 CCT GGA AAA TGG 185 CAT TGC CTA CAG 186 /56-FAM/CCG ATG ACC /ZEN/TAT GCC (6439) CCT CCT T GAA GTC TGG AAG AGT GTG AG/31ABkFQ/ SCGB3A2 Lu 187 CCA GAG GTA AAG 188 TCC CAG ATA ACT 189 /56-FAM/AAG GCA GTA /ZEN/GCA GAG (117156) GTG CCA AC GTC ATG AAG C TAA CTA CAA AGG C/31ABkFQ/ SERPINA3 Br, Lu 190 CCT CAA ATA CAT 191 GGA AGC CTT CAC 192 /56-FAM/TAG CAG TCT /ZEN/CCC AGG (12) CAA GCA CAG C CAG CAA TGG TCC A/31ABkFQ/ SFRP2 Br, Lu 193 TTG CAG GCT TCA 194 GCC CGA CAT GCT 195 /56-FAM/TTT CCC CCA /ZEN/GGA CAA (6423) CAT ACC TT TGA GT CGA CCT TT/31ABkFQ/ CRABP2 Br, Lu 196 CTC TTG CAG CCA 197 CCC TTA CCC CAG 198 /56-FAM/TTT CTT TGA /ZEN/CCT CTT (1382) TTC CTC TT TCA CTT CT CTC TCC TCC CCT/31ABkFQ? 
AQP4 Lu 199 TGG ACA GAA GAC 200 GGT GCC AGC ATG 201 /56-FAM/CCG ATC CTT /ZEN/TGG ACC (361) ATA CTC ATA AAG AAT CCC TGC AGT TAT CA/31ABkFQ/ G TMPRS54 Br, Lu 202 ATC TTC CCT CCA 203 CAG TTC CCA CTC 204 /56-FAM/CTC ACT CCA /ZEN/GCC ACC (58649) TTC TGC TTC ACT TTC TCA G CCA CTC /31ABkFQ/ GREM1 Lu 205 TTT TGC ACC AGT 206 GCC GCA CTG ACA 207 /56-FAM/CCT ACA CGG /ZEN/TGG GAG (26585) CTC GCT T GTA TGA G CCC TG/31ABkFQ/ FOXF1 Lu 208 CGA CTG CGA GTG 209 CTC TCC ACG CAC 210 /56-FAM/CTG CAC CAG /ZEN/AAC AGC (2294) ATA CCG TCC CT CAC AAC G/31ABkFQ/ NKX2-1 Lu 211 TGC CGC TCA TGT 212 CAG GAC ACC ATG 213 /56-FAM/CCC GCC ATC /ZEN/TCC CGC (7080) TCA TGC AGG AAC AG TTC A/31ABkFQ/ NKX2-1 Lu 214 AAG ATG TCA GAC 215 CGA AGC CCG ATG 216 /56-FAM/ATG TCG ATG /ZEN/AGT CCA (7080) ACT GAG AAC G TGG TC AAG CAC ACG A/31ABkFQ/ AFP Li 217 AGGAGATGTGCTGGA 218 TCTGCATGAATTATA 219 /56-FAM/AAT GCT GCA /ZEN/AAC TG (174) TTGTC CATTGACCAC CCA CGC TG/31ABkFQ/ AHSG Li 220 AATGTGGAGTTTACA 221 AGCTTCTCACTGAGT 222 /56-FAM/CCA CAG AGG /ZEN/CAG CCA (197) GTGTCTGG GTTGC AGT GTA ACC /31ABkFQ/ ALB Li 223 GAG ATC TGC TTG 224 CAA CAG AGG TTT 225 /56-FAM/AGA TAT ACT /ZEN/TGG CAA (213) AAT GTG CTG TTC ACA GCA T GGT CCG CCC /31ABkFQ/ ALB Li 226 CAT GGT AGG CTG 227 GAC GAT AAG GAG 228 /56-FAM/ACT TGT TGC /ZEN/TGC AAG (213) AGA TGC TTT ACC TGC TTT G TCA AGC TGC /31ABkFQ/ ALB Li 229 GCG CAT TCT GGA 230 GCT ATG CCA AAG 231 /56-FAM/ACC TCT TGT /ZEN/GGA AGA (213) ATT TGT ACT C TGT TCG ATG GCC TCA GAA /31ABkFQ/ APOH Li 232 TGA TGG ATA TTC 233 CCT GAA TCT TTA 234 /56-FAM/CCA GTT TCC /ZEN/CAG TTT (350) TCT GGA TGG C CTC TCT CTC CTT GGT ACA TTC TAT TTC TTC C/ G 31ABkFQ/ FABP1 Li 235 GCA CTT CAA GTT 236 ACC AGT TTA TTG 237 /56-FAM/AAC CAC TGT /ZEN/CTT GAC (2168) CAC CAT CAC TCA CCT TCC A TTT CTC CCC TG/31ABkFQ/ FGB Li 238 ACA TCT ATT ATT 239 TGG GAG CCT CTT 240 /56-FAM/ACC CTC CTC /ZEN/ATT GTC (2244) GCT ACT ATT GTG CTC TCT TC GTT GAC ACC /31ABkFQ/ TGT T FGG Li 241 TTC ATT TGA TAA 242 ACC TTG AAC ATG 243 /56-FAM/TGC CAT TCC 
/ZEN/AGT CTT (2266) GCA CAC AGT CTG GCA TAG TCT G CCA GTT CCA C/31ABkFQ/ GPC3 Li 244 AATCAGCTCCGCTTC 245 TGCTTATCTCGTTGT 246 /56-FAM/TTC CAG GCG /ZEN/CAT CAT (2719) TTG CCTTCG CCA CAT CC/31ABkFQ/ RBP4 Li 247 CAG AAG CGC AGA 248 TCT TTC TGA TCT 249 /56-FAM/AGG CTG ATC /ZEN/GTC CAC (5950) AGA TTG TAA G GCC ATC GC AAC GGT T/31ABkFQ/ TF Li 250 AGA AGC GAG TCC 251 CAC TGC ACA CCA 252 /56-FAM/CCA GAC ACA /ZEN/ GCC CCA (7018) TAC TGT TCT CAC A GGA CG/31ABkFQ/

Note that PRAME is also named MAPE (Melanoma Antigen Preferentially Expressed In Tumors), OIP4 (Opa-Interacting Protein OIP4), and CT130 (Cancer/Testis Antigen 130).

The following Table 2 lists nested primers designed to specifically pre-amplify the regions targeted by primers listed in Table 1.

TABLE 2

Primer name  Seq ID  Nested Forward                       Seq ID  Nested Reverse
FAT1         253     CAG ATG GAG GAG GAA GAT TCT G        254     GTA TAC TGC CTG GAG TTC TCT G
FAT2         255     CTG GTT CAG GTC TCC ATT ACA G        256     GCT GTG ACT CTG AGC AAG TA
AGR2         257     TGT CCT CCT CAA TCT GGT TTA TG       258     GAC AGA AGG GCT TGG AGA TTT
PKP3         259     CGG TGG CGT TGT AGA AGA T            260     AGA AGA TCT CTG CCT CCG A
RND3         261     CAA GAT AGT TGT GGT GGG AGA C        262     AGG GTC TCT GGT CTA CTG ATG
TFAP2C       263     TTTGGATTTACCGCTTGGG                  264     GACFTCCAGTGTGGGAGAG
S100A2       265     GGG CCC ACA TAT AAA TCC TCA C        266     CTG CTG GTC ACT GTT CTC ATC
PRAME        267     CTTCGCGGTGTGGTGAA                    268     GCTGTGTCTCCCGTCAAA
PIP          269     CTG GGA CAC ATT GCC TTC T            270     CCA CCA TGC ATT CTT TCA ATT CT
PGR          271     AAA CCC AGT TTG AGG AGA TGA G        272     CCC TGC CAA TAT CTT GGG TAA T
SCGB2A1      273     ACA GCA ACT TCC TTG ATC CC           274     GCG GCA TCA CTG TCT ATG AA
MUCL1        275     CCT TGC CTT CTC TTA GGC TTT          276     AGC AGT GGT TTC AGC ATC A
PGR          277     CAG ATA ACT CTC ATT CAG TAT TCT TGG  278     CTC TAA TGT AGC TTG ACC TCA TCT
TFAP2C       279     GAG AAG TTG GAC AAG ATT GGG          280     GCT GAG AAG TTC TGT GAA TTC TTT A
SCGB2A1      281     GTT TCC TCA ACC AGT CAC ATA GA       282     AGT TGT CTA GCA GTT TCC ACA TA
FAT1         283     GGG AAA GCC TGT CTG AAG TG           284     TCG TAG CCT CCA GGG TAA TAG
FAT2         285     GTT ACA GGI CTC CTA TCT ACA GC       286     GCT CAG CCT CTC TGG AAG
RND3         287     CTC TCT TAC CCT GAT TCG GAT G        288     GGC GTC TGC CTG TGA TT
SFTPB        289     CCT GAG TTC TGG TGC CAA AG           290     GGG CAT GAG CAG CTT CAA
SCGB3A2      291     CCA CTG GCT TGG TGG ATT T            292     TCA ACA GAA ATG CCC AGA GTT
SERPINA1     293     CTT CTC CAG CTG GGC ATT              294     TGC TGT GGC AGC AGA TG
SFRP2        295     CGG TCA TGT CCG CCT TC               296     GCG TTT CCA TTA TGT CGT TGT C
CRABP2       297     CCC TCC TTC TAG GAT AGC G            298     AAC CCG GAA TGG GTG AT
AQP4         299     AAACGGACTGATGTCACTGG                 300     TGGACAGAAGACATACTCATAAAGG
TMPRSS4      301     CCCACTGCTTCAGGAAAGATA                302     GTCAGACATCTTCCCTCCATTC
GREM1        303     GCCGCACTGACAGTATGA                   304     CAGAAGGAGCAGGACTGAAA
FOXF1        305     AGC GGC GCC TCT TAT ATC              306     GCG TTG AAA GAG AAG ACA AAC T
NKX2-1       307     CTA CTG CAA CGG CAA CCT              308     GGG CCA TGT TCT TGC TCA
NKX2-1       309     CAG ACT CGC TCG CTC ATT T            310     CCT CCA TGC CCA CTT TCT T
PIP          311     CCCAAGTCAGIACGTCCAAAT                312     GCCTAATTCCCGAATAACATCAAC
AGR2         313     GCT TTA AAG AAA GTG TTT GCT G        314     CTG TAT CTG CAG GTT CGT AAG
SOX10        315     AAG TTC CCC GTG TGC ATC              316     CTC AGC CTC CTC GAT GAA
MAGEA6       317     GTGAGGAGGCAAGGTTCTC                  318     GGCTCCAGAGAGGGTAGTT
TFAP2C       319     TTTGGATTTACCGCTTGGG                  320     GACTCCAGTGTGGGAGAG
PRAME        321     CTTCGCGGTGTGGTGAA                    322     GCTGTGTCTCCCGTCAAA
GPR143       323     ATC CTG CTG TAT CAC ATC ATG          324     CTG ACA GGT TTC AAA GAA CCT
PMEL         325     CCAGTGCCTTTGGTTGCT                   326     CAAGAGCCAGATGGGCAAG
MLANA        327     TCCCAAGAGAAGATGCTCAC                 328     CATTGAGTGCCAACATGAAGAC
PTPRZ1       329     AAG AAG CTG CCA ATA GGG AT           330     TGT CCA GAG AGG TGG ATG

Multiplex Digital Analysis of Gene Transcripts from CTC-Chip Products

To improve the detection of tumor-specific mRNA from minimal amounts of RNA derived from CTCs, we established a multiplex assay capable of testing many different gene transcripts from a minute amount of CTC-Chip product. This combines the higher sensitivity/specificity of using multiple independent genes with the constraint that the amount of input template is limited (and hence should not be diluted into multiple reactions). Our assay includes 4 genes per reaction, with each gene being resolved uniquely in 2-dimensional space by selecting different ratios of fluorescent conjugated primers. Thus, in a single reaction, we can independently measure 4 gene transcripts without having to dilute the template. For different cancers, we have run up to 4 different reactions (i.e., up to 20 different gene transcripts), and with application of nested RT-PCR digital assays, there is no limit to the number of reactions that can be performed.

This multiplex strategy balances two needs: analyzing multiple transcripts (and hence guarding against heterogeneous variation in cancer cell expression patterns) and avoiding dilution of the input material across multiple independent PCR reactions. Depending on tumor types and the number of genes required for optimal signal, we have developed assays ranging from 2-4 multiplex reactions (each multiplex reaction testing for 4 genes). Thus, without undue dilution of input template, we can interrogate the product of a single CTC for expression of anywhere from 8 to 16 different genes. It is important to the assay to be able to add the signal from all of these genes (i.e., cumulative signal), while also having individual gene results (to optimize signal/noise at the individual gene level, and also to gather information from the specific signaling pathways that each gene interrogates, for example androgen signaling in prostate CTCs).

To display the results of the multiplex reaction in a single view (and hence differentiate amplification of each gene in isolation), we varied the concentrations of the two fluorescent probes (FAM (blue) and HEX (green)). By doing this, each individual gene amplification reaction has a unique combination of FAM/HEX signal that reflects the composition of the gene-specific primers, and hence identifies the gene-specific PCR product. In 2-dimensional space, we can illustrate the signal position of 4 different gene amplification products produced from a single multiplex reaction. As applied to digital PCR using droplets to encapsulate each PCR reaction, this method separates the targets into individual clusters by modifying the binary signal amplitude of positive droplets, which are displayed quantitatively. As predicted, this method allows cumulative scoring of total signal for multiple genes (e.g., 16 markers in a total of 4 reactions), while also retaining the ability to quantify the signal from each individual gene target.

Specific results of testing are detailed in the examples below.

Applications of the d-CTC Assay Methods

The early detection of epithelial cancers at a time when they can be surgically resected or irradiated provides the best chance of cure, and the administration of adjuvant chemotherapy in the setting of minimal cancer dissemination is far more effective in achieving cure than the treatment of established metastatic disease. However, current efforts at early cancer detection suffer from lack of specificity. For instance, widespread screening of men for prostate cancer using serum PSA measurements is effective in uncovering early cancers, but it also identifies a much larger number of non-malignant prostate conditions (e.g., benign hypertrophy of the gland) or even cancers that are indolent and never destined to become invasive. As such, broad PSA screening is not recommended by public health organizations, because the number of complications (including deaths) from over-diagnosis matches or even outweighs the calculated benefit in early cancer detection.

For other cancers, such as breast cancer, mammography is considered effective, but even then a large number of breast biopsies are performed to diagnose each true malignancy. For lung cancer, the recently recommended low-dose CT scanning of individuals with a heavy cigarette smoking history is also likely to detect hundreds of innocent radiographic abnormalities for each true malignancy.

It is in this context that the addition of a blood-based ultra-sensitive readout for the presence of cancer cell-derived signatures would provide the required specificity. The d-CTC assays described herein can be used for both initial screening and as a confirmation of earlier screenings at a later time. For example, in some cases the assays can be used as a second-line test to validate a highly sensitive, but nonspecific screening test (e.g., PSA in prostate cancer). In other settings for which a cancer is highly lethal, but no screening approach currently exists (e.g., pancreatic cancer), routine periodic blood screening using the assays described herein may become the norm to monitor a patient's status or condition over time.

The new d-CTC readouts are also highly relevant to the serial monitoring of patients, e.g., seemingly healthy patients with a family history and/or genetic markers of a specific type of cancer, or patients with advanced or metastatic cancer. Imaging of CTCs is expensive and relatively insensitive, in that intact cells that stain appropriately for all required markers produce a single signal. The use of the new d-CTC assays described herein, in which each CTC (no matter how intact or pre-apoptotic) can give rise to hundreds of molecular signals, dramatically enhances the ability to detect and monitor CTCs in patients with known cancer, and to quantitatively monitor and analyze their response to therapeutic interventions. Beyond scoring for cell numbers through molecular markers, specific interrogation of mutations or cancer-associated rearrangements (e.g., EML4-ALK in lung cancer) can be achieved with comparable sensitivity.

In addition to providing a digital (quantitative) measure of CTCs present within a blood sample, the new d-CTC assay also allows analysis of specific signaling pathways that are unique to the tumor cells in the blood. For instance, a subset of prostate lineage-specific genes is driven by androgen signaling (such as PSA), while another subset is repressed by androgen signaling (such as PSMA). By analyzing these genes together, we can ascertain the status of androgen signaling within CTCs. Similarly, in breast cancer, expression of estrogen-responsive genes (such as progesterone receptor) provides a measure of the status of the estrogen-responsive pathway within CTCs. These measurements are particularly important in that therapeutic interventions in both prostate and breast cancers are designed to target the androgen and estrogen receptors, respectively. Thus, defining the total CTC signal in the blood, simultaneously with information about the effectiveness of the therapeutic agent in targeting and shutting off the critical pathway, is important for therapeutic monitoring.
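By way of illustration only, the idea of reading androgen signaling status from androgen-driven and androgen-repressed genes can be sketched as follows. The ratio used here, the function name `ar_on_fraction`, and the droplet counts are assumptions made for this example and are not the scoring method of the assay; only the assignment of PSA (KLK3) as androgen-driven and PSMA (FOLH1) as androgen-repressed comes from the text.

```python
# Illustrative sketch only: the "AR-on fraction" metric is an assumption,
# not a scoring rule taken from the assay described in the text.

AR_ON_GENES = {"KLK3"}    # PSA: driven by androgen signaling (per the text)
AR_OFF_GENES = {"FOLH1"}  # PSMA: repressed by androgen signaling (per the text)


def ar_on_fraction(droplets_by_gene):
    """Return the share of AR-informative signal that comes from AR-on genes.

    droplets_by_gene maps a gene name to its positive droplet count.
    Returns None when no AR-informative gene produced any signal.
    """
    on = sum(n for gene, n in droplets_by_gene.items() if gene in AR_ON_GENES)
    off = sum(n for gene, n in droplets_by_gene.items() if gene in AR_OFF_GENES)
    total = on + off
    if total == 0:
        return None
    return on / total


# Hypothetical droplet counts for one sample.
example_fraction = ar_on_fraction({"KLK3": 30, "FOLH1": 10, "EpCAM": 4})
```

A value near 1 would suggest predominantly androgen-driven transcripts and a value near 0 predominantly androgen-repressed transcripts, under the assumptions stated above.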

As discussed in the examples below, the new methods described herein are illustrated in prostate cancer, where the anti-androgenic agent abiraterone (e.g., ZYTIGA®) is effective in suppressing cancer progression, particularly in tumors that are still dependent on the androgen pathway.

EXAMPLES

The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.

Example 1 Preliminary Testing and Validation of the Digital CTC Assay

To test the feasibility of the CTC-Chip-Droplet assay, we first selected several transcripts that are specifically expressed in prostate tumor cells, but are absent in contaminating leukocytes. These were the prostate lineage-specific markers KLK3 (kallikrein-related peptidase; aka Prostate Specific Antigen, or PSA), FOLH1 (Folate Hydrolase; aka Prostate Specific Membrane Antigen, or PSMA) and AMACR (alpha-methylacyl-CoA racemase), as well as EpCAM (Epithelial Cell Adhesion Molecule). PCR conditions were optimized using intron-spanning primers and ZEN double-quenched FAM-labelled probes from Integrated DNA Technologies (Coralville, Iowa) following standard qPCR protocols. These conditions were first tested with encapsulated cDNA from admixtures of cancer cells and leukocytes in order to explore the dynamic range of the system. Next, using manual isolation techniques for individually selecting cells, 0, 3, 6, 12, 25, and 125 prostate cancer LNCaP cells were progressively spiked into individual 5 ml aliquots of HD blood, followed by CTC-iChip processing, RT-PCR and droplet encapsulation using the RainDrop system. We chose KLK3 as the target transcript for this experiment as it is predicted to be modestly abundant. Using an intensity threshold of 5,000, we found that as few as 3 cells' worth of KLK3 transcript was readily detected, at approximately 250 droplets.
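As a minimal sketch of the thresholding step mentioned above, positive droplets can be counted by comparing each droplet's fluorescence intensity to the threshold of 5,000. The function name, the use of "greater than or equal to" at the boundary, and the example intensities are assumptions for illustration.

```python
def count_positive_droplets(intensities, threshold=5000):
    """Count droplets whose fluorescence intensity meets the threshold.

    The default of 5000 is the intensity threshold stated in the text;
    treating a droplet exactly at the threshold as positive is an assumption.
    """
    return sum(1 for value in intensities if value >= threshold)


# Hypothetical droplet intensities for illustration.
example_positive = count_positive_droplets([100, 4999, 5000, 12000])
```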

Based on these preliminary data, we tested the CTC-Chip Droplet assay in patients with metastatic and localized prostate cancer versus healthy controls. Each sample was run through the iChip, then CTC-containing product was run through droplet RT-PCR using the four prostate markers mentioned above: KLK3, AMACR, FOLH1 and EpCAM. Patients with either local or metastatic prostate cancer produced significantly higher positive droplet counts as compared to HD controls.

FIG. 1A shows cDNA dilutions prepared from total RNA of LNCaP prostate cancer cells, mixed with leukocytes and analyzed by droplet PCR using two different prostate primer sets. The results represent several purities and show good response of positive droplet number across this range.

FIG. 1B shows manually isolated LNCaP cells spiked into HD blood samples, run through the iChip, and subjected to droplet RT-PCR (KLK3 primer set). The results show excellent sensitivity down to low numbers of target cells.

FIG. 1C shows the analysis of blood samples from healthy controls, patients with localized (resectable) prostate cancer and metastatic prostate cancer, processed through the CTC-iChip, subjected to RT-PCR and droplet analysis using three prostate-specific and one epithelial-specific biomarkers (KLK3, AMACR, FOLH1, EpCAM). The results are shown for the total number of droplets/ml for all four markers combined.
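By way of illustration, a combined readout of the kind plotted in FIG. 1C (total droplets/ml for all four markers) could be computed as below. The function name and the droplet counts are hypothetical; the source states only that results are shown as total droplets per ml for the four markers combined.

```python
def droplets_per_ml(positive_by_marker, blood_volume_ml):
    """Sum positive droplets over all markers and normalize to blood volume."""
    if blood_volume_ml <= 0:
        raise ValueError("blood_volume_ml must be positive")
    return sum(positive_by_marker.values()) / blood_volume_ml


# Hypothetical counts for the four markers named in the text.
example_signal = droplets_per_ml(
    {"KLK3": 30, "AMACR": 10, "FOLH1": 5, "EpCAM": 5},
    blood_volume_ml=10.0,
)
```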

These results suggest that the application of a droplet-based PCR readout to the CTC-iChip greatly enhances its sensitivity in detecting virtually all CTCs present in a biological specimen. Taken together, the CTC-iChip and Droplet-PCR represent two powerful microfluidic technologies that are highly compatible with each other and can be integrated in-line to create a new and highly sensitive and accurate biological assay.

Example 2 Digital CTC Assay Protocol

This example provides a general digital CTC assay protocol that can be used for the methods described herein. Different aspects of this general protocol were used in some of the Examples described herein. For example, Approach 1 of Step 3 of the protocol described below (relating to RNA purification to cDNA synthesis), was used to generate data for FIGS. 15A to 15C. Approach 2 in Step 3 was used to generate data for FIGS. 19A to 24B.

1. Patient blood is run through I-Chip, version 1.3M or 1.4.5 T. Sample is collected in a 15 mL conical tube on ice.

2. Sample is spun down at 4° C. Supernatant is decanted and SUPERase™ In (DTT-independent RNase inhibitor)+RNALater® Stabilization Solution (prevents RNA degradation by inhibiting RNases) is added to the pellet. Sample is flash frozen and placed at −80° C. until further processing. Samples are stable at −80° C.

3. There are two different processing protocols for RNA purification to cDNA synthesis that were used in the examples described below.

Approach 1

    a. Sample was thawed on ice.
    b. Direct lysis of sample using detergents (NP40, Tween20).
    c. Lysed sample was taken straight for cDNA synthesis (Superscript III).
    d. After cDNA synthesis, sample was purified via SPRI (Agencourt AMPure® XP beads) clean-up to clean up detergents and any nucleotides <100 bps.

Approach 2

    a. Sample was thawed on ice.
    b. Sample was processed on RNeasy Qiagen Micro Kit. Protocol has some slight variations compared to traditional Qiagen recommendations. Higher volumes of Buffer RLT (Lysis buffer) were used as well as higher ETOH concentrations. These modifications were made because of RNALater® addition to the sample.
    c. After cDNA synthesis, sample was purified via SPRI (Agencourt AMPure XP beads) clean-up to clean up detergents and any nucleotides <100 bps.

4. cDNA (synthesized from Approach 1 or 2) can be processed in two different ways:

    a. cDNA was used directly for ddPCR; or
    b. cDNA was amplified using a Fluidigm BioMark™ Nested PCR approach (primers from genes used for nested PCR have been pre-validated). Amplified cDNA was diluted.

5. cDNA (from step 4a or 4b), Biorad Supermix™ for probes, primer or primers (for gene of interest; up to 4 different primers (FAM and HEX) can be multiplexed) were added in a total volume of 22 μl.

6. Droplets were generated (˜15,000-18,000 droplets per well).

7. Droplet samples were placed in a PCR machine. The PCR conditions differed from Biorad recommendations. We used a step-down rather than a slow ramp to ensure that all droplets reach the same temperature. This differs from what both RainDance and Biorad use. Better results (i.e., more signal and more separation between positive and negative droplets) can be obtained with the step-down rather than the gradient.

8. After the PCR, positive droplets were counted in a ddPCR machine.

9. Data is collected and analyzed using TIBCO® Spotfire® analysis software.
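The protocol above counts positive droplets directly. As an optional illustration, digital PCR readouts are commonly converted to an average number of template copies per droplet with a Poisson correction, which accounts for droplets that received more than one copy. The correction and the function name below are assumptions for this sketch and are not stated as part of the protocol.

```python
import math


def mean_copies_per_droplet(positive, total):
    """Poisson estimate of average template copies per droplet.

    lambda = -ln(1 - positive / total). This is the standard digital PCR
    correction; its use here is an illustrative assumption.
    """
    if total <= 0 or positive < 0 or positive >= total:
        raise ValueError("require 0 <= positive < total and total > 0")
    return -math.log(1.0 - positive / total)


# Hypothetical well with 15,000 droplets (the lower end of the range in
# step 6) and 250 positive droplets.
example_lambda = mean_copies_per_droplet(250, 15000)
example_copies = example_lambda * 15000  # estimated template copies in the well
```

At the low positive fractions expected for rare CTC-derived transcripts, the corrected copy number is only slightly higher than the raw positive droplet count.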

The reagents, reagent concentrations, and reaction volumes are provided below:

Reagents:

    -   Biorad ddPCR™ Supermix for Probes (No dUTP)
    -   IDT primers/probes (20× or 40×)
    -   cDNA (1 ng/ul for cell lines)
    -   Nuclease free water
    -   Eppendorf semi-skirted 96 well plate (Only these plates work with the machine)

Testing Relevant Cell Lines

Per Single Reaction:

ddPCR Supermix    11.0 μl
Primer (20x)      1.10 μl
cDNA (1 ng/μl)    1.10 μl
Water             8.80 μl
TOTAL             22.0 μl per well

A master-mix containing ddPCR supermix, cDNA, and water was aliquoted into wells, and 1.1 μl of each primer was added to each well and mixed well.

Patient Samples

Per single reaction for Individual Genes

ddPCR Supermix    11.0 μl
Primer (20x)      1.1 μl
cDNA (patient)    Up to 9.9 μl (Balance with water if less)
TOTAL             22.0 μl per well

Per Single Multiplexed Reaction for Multiple Genes

ddPCR Supermix    11.0 μl
Primer 1 (40x)    0.55 μl
Primer 2 (40x)    0.55 μl
Primer 3 (40x)    0.55 μl
Primer 4 (40x)    0.55 μl
cDNA (patient)    8.8 μl
TOTAL             22.0 μl per well

When testing multiple patients against a gene-specific primer or multiplexing primers against multiple genes, a master-mix, which includes the ddPCR supermix and primers, was aliquoted into wells followed by addition of patient cDNA to each well and mixed well.
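The master-mix arithmetic for the multiplexed patient reaction can be sketched as follows. The per-well volumes are taken from the multiplexed recipe above; the function name and the 10% pipetting overage are assumptions added for illustration.

```python
# Per-well volumes (in microliters) of the master-mix components, taken from
# the multiplexed patient-sample recipe. Patient cDNA is added separately.
PER_WELL_MASTER_MIX_UL = {
    "ddPCR Supermix": 11.0,
    "Primer 1 (40x)": 0.55,
    "Primer 2 (40x)": 0.55,
    "Primer 3 (40x)": 0.55,
    "Primer 4 (40x)": 0.55,
}
PATIENT_CDNA_UL = 8.8
TOTAL_UL = 22.0


def master_mix_volumes(n_wells, overage=0.10):
    """Scale the master-mix components to n_wells.

    overage is a hypothetical pipetting allowance (10% by default); it is
    not specified in the source protocol.
    """
    if n_wells < 1:
        raise ValueError("n_wells must be at least 1")
    scale = n_wells * (1.0 + overage)
    return {
        reagent: round(volume * scale, 2)
        for reagent, volume in PER_WELL_MASTER_MIX_UL.items()
    }


# Example: master mix for 10 wells with no overage.
mix_for_10 = master_mix_volumes(10, overage=0.0)
```

Each well then receives 13.2 μl of master mix followed by 8.8 μl of patient cDNA, for the stated total of 22.0 μl.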

Example 3 Protocol for Gene Validation

The following protocol was used for selecting the specific marker genes listed in Table 1.

    1. Transcripts that are unique to CTCs and not expressed in white blood cells (WBCs), leukocytes, etc. were mined bioinformatically: primary tumor and CTC gene expression data were compared to WBC gene expression datasets to isolate transcripts that were present only in primary tumor and/or CTCs.
    2. Transcripts that passed a threshold cutoff were validated by qPCR.
    3. Primers were synthesized by IDT. Probes were labeled with FAM/ZEN/IBFQ.
    4. qPCR validation required that every transcript be validated by at least two independent primer sets on two different cell lines, 5 healthy donors' WBCs (isolated via CPT column) and water as a negative control. 50 cycles for qPCR were used to confirm that expression of a transcript was only present in cell lines and not in healthy donors.
    5. Transcripts that passed qPCR validation were validated on ddPCR with cell lines and healthy donors passed through the CTC-iChip (with and without cell spiking).
    6. Panels of transcripts were multiplexed (up to 4 different genes per reaction) depending on disease of interest.
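The qPCR acceptance rule in step 4 can be sketched as a simple check. This is one possible reading of the rule, offered as an assumption: a primer set counts as valid when it detects the transcript in at least two cell lines and detects nothing in at least five healthy donor WBC samples or in the water control, and a transcript passes when at least two primer sets are valid. The record layout and all sample names other than LNCaP are hypothetical.

```python
from collections import defaultdict


def passes_qpcr_validation(records, min_primer_sets=2, min_cell_lines=2,
                           min_donors=5):
    """Apply an assumed reading of the step 4 qPCR acceptance rule.

    records is a list of (primer_set, sample_type, sample_id, detected)
    tuples, where sample_type is "cell_line", "healthy_donor", or "water".
    """
    by_set = defaultdict(list)
    for primer_set, sample_type, sample_id, detected in records:
        by_set[primer_set].append((sample_type, sample_id, detected))

    valid_sets = 0
    for rows in by_set.values():
        positive_lines = {sid for t, sid, d in rows if t == "cell_line" and d}
        donors = [(sid, d) for t, sid, d in rows if t == "healthy_donor"]
        water = [d for t, _sid, d in rows if t == "water"]
        donors_clean = (len({sid for sid, _d in donors}) >= min_donors
                        and not any(d for _sid, d in donors))
        water_clean = bool(water) and not any(water)
        if len(positive_lines) >= min_cell_lines and donors_clean and water_clean:
            valid_sets += 1
    return valid_sets >= min_primer_sets


# Hypothetical results for one transcript tested with two primer sets.
example_records = []
for primer_set in ("set1", "set2"):
    example_records.append((primer_set, "cell_line", "LNCaP", True))
    example_records.append((primer_set, "cell_line", "cell_line_B", True))
    for i in range(5):
        example_records.append((primer_set, "healthy_donor", f"HD{i}", False))
    example_records.append((primer_set, "water", "NTC", False))

example_pass = passes_qpcr_validation(example_records)
```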

The validity of this strategy is shown below in a spiked cell experiment, in which a carefully measured number of tumor cells (from the LNCaP prostate cancer cell line) are individually micro-manipulated, added to control blood specimens, passed through the CTC-iChip and then analyzed by d-CTC assay as above. Increasing numbers of spiked cells produce increasing digital signal, as shown in FIG. 2, which illustrates the power of this protocol. FIG. 2 demonstrates the use of a single gene transcript (KLK3, also known as PSA, for prostate cancer) as a probe (in the assay, we use from 8-24 gene transcripts, thereby further increasing sensitivity). Here, we spike a calculated number of cancer cells (each cell is micro-manipulated, picked and introduced into 10 ml of control blood specimen). The blood is then processed through the CTC-Chip and subjected to digital readout as described above. No signal is observed in blood that has not been spiked with a single cancer cell. Introduction of 2 cells/10 ml of blood generates clear signal (65 positive droplets). In this case, the 10 CTC product was divided into 4 and run in quadruplicate, so the 64 droplets actually represent the digital signal derived from ¼ of a tumor cell.

This assay is both highly sensitive and reproducible. As shown in FIG. 3, the digital signal in these spiked cell experiments shows high reproducibility (2 independent replicates shown here), and the same amount of signal is seen when cells are spiked into buffer (rather than blood) and directly analyzed (without CTC-Chip processing). Thus, there is virtually no loss of signal when a tumor cell is diluted into billions of normal blood cells and then “re-isolated” using the CTC-Chip prior to digital readout.

Example 4 Multiplex Digital Analysis of Gene Transcripts from CTC-Chip Product

We established a multiplex assay capable of testing many different gene transcripts from a minute amount of CTC-Chip product. This combined the higher sensitivity and specificity of using multiple independent genes, with the fact that the amount of input template is limited (and hence should not be diluted into multiple reactions). The new assays include multiple genes, e.g., 2, 3, 4, 6, 8, 10, or more genes per reaction, with each gene being resolved uniquely in 2-dimensional space by selecting different ratios of fluorescent conjugated primers. Thus, in a single reaction, one can independently measure 2, 3, 4, or more gene transcripts without having to dilute the template. For different cancers, one can run and analyze multiple different reactions (e.g., up to 20 different gene transcripts in four runs), and with application of nested RT-PCR digital assays, there is no limit to the number of reactions that can be performed.

To display the results of the multiplex reaction in a single view (and hence differentiate amplification of each gene in isolation), we varied the concentrations of the two fluorescent probes (FAM and HEX). By doing this, each individual gene amplification reaction has a unique combination of FAM/HEX signal that reflects the composition of the gene-specific primers, and hence identifies the gene-specific PCR product. In 2-dimensional space, we can illustrate the signal position of 4 different gene amplification products produced from a single multiplex reaction. As applied to digital PCR using droplets to encapsulate each PCR reaction, this method separates the targets into individual clusters by modifying the binary signal amplitude of positive droplets, which are displayed quantitatively. As predicted, this method allows cumulative scoring of total signal for multiple genes (e.g., 16 markers in a total of 4 reactions), while also retaining the ability to quantify the signal from each individual gene target.

Probe 1: 100% FAM

Probe 2: 100% HEX

Probe 3: Mixture of FAM and HEX—sum up to 100%

Probe 4: Mixture of FAM and HEX—sum up to 100%

As shown in Tables 3 to 7, the following probe mixtures were used in the multiplex reactions:

TABLE 3
Multiplexing primers against 4 genes per reaction (Melanoma)

Reaction      FAM    HEX    Primer    FAM int   HEX int
Reaction 1    100%   0      Sox10     6000      0
              70%    30%    SFRP1     4000      2500
              30%    70%    RND3      4500      5500
              0%     100%   TFAP2C    0         6000
Reaction 2    100%   0      PRAME     11000     0
              70%    30%    MLANA     8000      4000
              30%    70%    MAGEA6    5000      6000
              0%     100%   PMEL      0         5500
Reaction 3    100%   0      PMEL      7000      0
              70%    30%    MLANA     6000      3000
              30%    70%    MAGEA6    4000      5000
              0%     100%   MET       0         4500

TABLE 4
Multiplexing primers against 4 genes per reaction (Pan-Cancer/lineage)

FAM    HEX    Primer     Exp. FAM Int   Exp. HEX Int
100    0      TFAP2C     9000           0
60     40     PGR        5100           1800
35     65     SCGB2A1    2205           7800
0      100    CADPS2     0              5000

TABLE 5
Multiplexing primers against multiple genes per reaction (AR status in Prostate)
Multiplexing primers against 4 genes per reaction (Prostate)

Reaction      FAM    HEX    Primer     Exp. FAM Int   Exp. HEX Int
Reaction 1    100    0      TMPRSS2    5500           0
              65     35     FAT1       5525           1837.5
              40     60     KLK2       2440           2580
              0      100    STEAP2     0              4300
Reaction 2    100    0      KLK3       6600           0
              70     30     HOXB13     4340           1320
              50     50     AGR2       4050           3050
              0      100    FOLH1      0              5200

TABLE 6
Multiplexing primers against 4 genes per reaction
Epithelial-Mesenchymal Transition (EMT)

Reaction      FAM    HEX    Primer    Exp. FAM Int   Exp. HEX Int
Reaction 1    100    0      PKP3      8000           0
              75     25     OCLN      6000           1625
              40     60     CDH11     4000           3600
              0      100    S100A2    0              5000
Reaction 2    100    0      FAT1      8000           0
              65     35     FAT2      5200           1750
              40     60     COL8A1    3200           3900
              0      100    CDH3      0              6000

TABLE 7
Multiplexing primers against multiple genes per reaction

Reaction   Gene-primer set   Avg intensity (FAM)   Avg intensity (HEX)   AR Status
1          TMPRSS2           5500                                        ON
1          FAT1              8500                  5250                  ?
1          KLK2              6100                  4300                  ON
1          STEAP2            3350                  4300                  ON
2          KLK3              6600                                        ON
2          FOLH1             6200                  5200                  OFF
2          AGR2              8100                  6100                  OFF
2          HOXB13            6500                  4400                  OFF
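Several of the expected intensities in Table 5 appear to equal the probe percentage multiplied by the corresponding average single-color intensity in Table 7 (for example, 65% of 8500 is 5525 for FAT1). The following sketch applies that relationship to three genes; the relationship is inferred from the tabulated numbers rather than stated in the text, and the function and variable names are hypothetical.

```python
def expected_position(fam_percent, avg_fam, avg_hex):
    """Expected (FAM, HEX) cluster position for a FAM/HEX probe mixture.

    Assumes each channel's expected intensity scales linearly with the share
    of probe carrying that fluorophore. This rule is inferred from Tables 5
    and 7 and is not stated explicitly in the source.
    """
    hex_percent = 100 - fam_percent
    return (fam_percent * avg_fam / 100, hex_percent * avg_hex / 100)


# Average intensities (FAM, HEX) from Table 7 and FAM percentages from Table 5.
AVG_INTENSITY = {"FAT1": (8500, 5250), "KLK2": (6100, 4300), "AGR2": (8100, 6100)}
FAM_PERCENT = {"FAT1": 65, "KLK2": 40, "AGR2": 50}

expected = {
    gene: expected_position(FAM_PERCENT[gene], *AVG_INTENSITY[gene])
    for gene in AVG_INTENSITY
}
```

For these three genes the computed positions match the expected intensities listed in Table 5.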

Validation and Testing

To validate and demonstrate the effectiveness of this multiplex strategy, we tested it both in spiked cell experiments (to illustrate the concept) and in patient-derived samples. FIG. 4 shows the results of processing a normal control blood sample from a healthy donor (HD) through the CTC-Chip and subjecting the product to d-CTC assay for 4 different gene transcripts, all of which are negative (i.e., blank droplets).

In contrast, FIG. 5 is a representation of data from spiked cell experiments: prostate cancer cell lines introduced into blood and processed through the CTC-Chip, followed by digital assay, showed positive signal (fluorescent droplets) for each of the 4 lineage transcripts. These appeared at separate locations within the 2-dimensional plot, based on differential fluorescence of two probes (color coded in picture). Because the sample was overloaded with tumor cells, some droplets contained signal from more than one gene transcript (multiple genes per droplet are shown in gray).

The strategy of representing four different genes within each reaction was applicable to multiple different cancers, with specific lineage markers substituted for each tumor type. For instance, in prostate cancer, we predicted (theoretical model) a multiplex reaction with four quadrants (one gene per quadrant) for each of 2 reactions (total of 8 gene markers). The spiked cell experiment (prostate cancer cells introduced into control blood and processed through the CTC-iChip) precisely recapitulated the predicted results.

Furthermore, FIGS. 6A-6B and FIGS. 7A-7B show that when assembled together, our analytic program integrated all positive signals within quadrants, just as predicted from modeling, allowing us to develop methods to score the specific gene signals. Multi-dimensional space analysis of signal allowed for automated analysis and scoring with a high level of accuracy. FIGS. 6A and 6B show the theoretical model and actual results, respectively, for a prostate cancer cell line for Reaction 1, and FIGS. 7A and 7B show the theoretical model and actual results, respectively, for the same prostate cancer cell line for Reaction 2.
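One simple way to automate the assignment of droplets to gene-specific clusters is a nearest-center rule in the 2-dimensional FAM/HEX plane, sketched below. This is an illustrative stand-in and is not the analytic program referred to above. The cluster centers are the expected intensities for Reaction 1 in Table 5; the negative cluster at the origin, the `max_distance` cutoff, and the function name are assumptions.

```python
import math

# Expected (FAM, HEX) positions for Reaction 1, taken from Table 5, plus an
# assumed cluster of negative (blank) droplets at the origin.
REACTION_1_CENTERS = {
    "negative": (0.0, 0.0),
    "TMPRSS2": (5500.0, 0.0),
    "FAT1": (5525.0, 1837.5),
    "KLK2": (2440.0, 2580.0),
    "STEAP2": (0.0, 4300.0),
}


def classify_droplet(fam, hex_, centers=REACTION_1_CENTERS, max_distance=1500.0):
    """Assign a droplet to the nearest cluster center.

    Droplets farther than max_distance (a hypothetical cutoff) from every
    center are left unassigned; such droplets could, for example, carry
    signal from more than one gene transcript.
    """
    name, distance = min(
        ((label, math.hypot(fam - c[0], hex_ - c[1])) for label, c in centers.items()),
        key=lambda item: item[1],
    )
    return name if distance <= max_distance else "unassigned"


# Hypothetical droplet readings.
example_calls = [
    classify_droplet(5400, 100),
    classify_droplet(2500, 2500),
    classify_droplet(50, 30),
    classify_droplet(9000, 9000),
]
```

Counting the droplets assigned to each label gives the per-gene signal, and summing over the gene labels gives the cumulative signal for the reaction.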

FIGS. 8A-8B (breast and lung cancer theoretical and actual results, Reaction 1), 9A-9B (breast and lung cancer theoretical and actual results, Reaction 2), 10A-10B (same, Reaction 3), 11A-11B (same, Reaction 4), 12A-12B (same, Reaction 5), and 13A-13B (same, Reaction 6) illustrate the results when the same approach was used with breast cancer and lung cancer. We can establish a multi-cancer panel that is effective in identifying markers shared by most adenocarcinomas (i.e., grouping breast and lung cancer together), as 6 reactions (4 gene markers within each reaction for a total of 24 markers), as shown below (theoretical vs validation using spiked cell experiments with both breast and lung cancer cells).

These figures show the results when the same approach of testing multiple gene transcripts in multiplex fashion (4 genes per reaction) was applied to breast cancer. Six different reactions were performed on the same CTC chip product (enabling a total of 24 gene transcripts to be tested independently), with each transcript having a designated signal position that was predicted (upper panel) and then observed in spiked cell validation experiments (lower panel).

Example 5 Target-Specific Pre-Amplification to Improve Detection of Tumor-Specific mRNA

To improve the detection of tumor specific RNAs, a nested PCR strategy was optimized for each of the gene-specific amplifications. To achieve this, cDNA derived from the CTCs was first amplified with gene-specific primers which are situated a few base pairs external to the gene-specific primers used for d-CTC assay. For each gene, two to three primer sets were tested, and the primer set that is compatible with the gene-specific d-CTC assay primer and tests negative in HD blood was chosen for analysis of patient samples.

As described above, the target specific amplification protocol was first tested in cell lines derived from the different cancers. The primer combinations that are specific for tumor cells (and absent in leukocytes) were then tested with a mixture of cancer cell lines mixed into blood and enriched through the CTC-iChip. HD blood processed through the CTC-iChip was used as control. Key to this strategy is the design of the nested PCR conditions to enhance the signal from minute amounts of CTC-derived cDNAs, without increasing the minimal baseline signal from normal blood cells. This selectivity was achieved by careful optimizing of PCR primer sequences and assay conditions, as well as balancing the cycle number for the external and internal PCRs. All conditions are validated first with purified nucleic acids, then with individual tumor cells that are spiked into control blood samples and processed through the CTC-iChip, then with large panels (>10) of different healthy blood donors, and ultimately with patient-derived blood samples from patients who have either metastatic or localized cancers of the prostate, breast, melanoma, liver, lung or pancreas.

Reagents

-   DNA Suspension Buffer (10 mM Tris, pH 8.0, 0.1 mM EDTA) (TEKnova, PN T0221)
-   0.5 M EDTA, pH 8.0 (Invitrogen, PN Am9260G)
-   TaqMan PreAmp Master Mix (Applied Biosystems, PN 4391128)
-   Nuclease-free Water (TEKnova, PN W330)

Preparing 10X Specific Target Amplification (STA) Primer Mix

-   1.) In a DNA-free hood, 0.5 μL of each of 200 μM primer pairs (0.5 μL Forward primer and 0.5 μL Reverse primer) were mixed.
-   2.) Each primer was diluted in 1× DNA Suspension Buffer to a final concentration of 500 nM. (Ex: If pooled primer volume equals 8 μL, add 192 μL DNA Suspension Buffer.)
-   3.) The mix was vortexed for 20 seconds and spun down for 30 seconds.
-   4.) 10× STA Primer Mix can be stored at 4° C. for repeated use for up to six months or stored frozen at −20° C. for long-term usage.
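The dilution in step 2 follows from C1·V1 = C2·V2 applied to each primer individually. The short sketch below reproduces the 8:192 split given in the example, assuming a pool of 16 primers (8 primer pairs); the primer count and the function name are illustrative assumptions, not values stated in the protocol.

```python
def primer_pool_dilution(stock_um, aliquot_ul, n_primers, final_nm):
    """Return (pooled primer volume, buffer volume to add), both in microliters.

    Each primer must satisfy C1 * V1 = C2 * V2, where V2 is the final volume
    of the whole pool after buffer is added.
    """
    pooled_ul = aliquot_ul * n_primers
    # Stock concentration is converted from uM to nM before dividing.
    total_ul = stock_um * 1000.0 * aliquot_ul / final_nm
    return pooled_ul, total_ul - pooled_ul


# Assumed pool: 16 primers (8 forward/reverse pairs), 0.5 uL of each 200 uM stock.
pooled, buffer = primer_pool_dilution(stock_um=200, aliquot_ul=0.5, n_primers=16, final_nm=500)
print(f"pooled primers: {pooled} uL, buffer to add: {buffer} uL")
```

Note that the final volume is fixed by the per-primer aliquot and the target concentration, so adding more primers to the pool reduces the buffer volume rather than increasing the total.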

Preparing STA Reaction Mix

-   1.) For each well of a 96-well PCR plate, prepare the following mix.

Component                       Per 9 μL Sample (μL)   96 Samples with overage (μL)
TaqMan® PreAmp Master Mix       7.5                    780.0
10X STA Primer Mix (500 nM)     1.5                    156.0
0.5M EDTA, pH 8.0               0.075                  7.8
Total Volume                    9.0                    943.8

-   2.) 6 μL cDNA was added to 9 μL STA reaction mix.
-   3.) Thermocycling conditions listed below were used with 18 cycles of denaturation and annealing/extension steps rather than 20 cycles. (Note: 18 cycles were used to compare TSA Pre-Amplification protocol to Whole Transcriptome Amplification).

                                    10 to 18 Cycles
Condition     Enzyme Activation     Denaturation   Annealing/Extension   Hold
Temperature   95° C.                96° C.         60° C.                4° C.
Time          10 minutes            5 seconds      4 minutes             Infinity

1 μl of the pre-amplified product is loaded in each droplet PCR reaction.
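The batch volumes in the STA reaction mix table above can be reproduced by scaling the per-sample volumes. The sketch below assumes that the "96 samples with overage" column corresponds to 104 reaction equivalents, which is inferred from 780.0 / 7.5 and is not stated in the protocol; the function name is likewise illustrative.

```python
# Per-sample volumes (uL) taken from the STA reaction mix table.
PER_SAMPLE_UL = {
    "TaqMan PreAmp Master Mix": 7.5,
    "10X STA Primer Mix (500 nM)": 1.5,
    "0.5 M EDTA, pH 8.0": 0.075,
}


def scale_mix(per_sample_ul, n_reactions):
    """Scale each component volume to the requested number of reaction equivalents."""
    return {name: round(vol * n_reactions, 3) for name, vol in per_sample_ul.items()}


# Assumption: 96 samples plus 8 reactions of overage = 104 reaction equivalents.
batch = scale_mix(PER_SAMPLE_UL, 104)
for name, vol in batch.items():
    print(f"{name}: {vol} uL")
print(f"Total: {round(sum(batch.values()), 1)} uL")
```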

FIG. 14 shows the droplet PCR signal for 7 markers (PIP, PRAME, RND3, PKP3, FAT1, S100A2, and AGR2) from 1 ng of non-amplified cell-line cDNA and from 1 μl of pre-amplified product after 10, 14, and 18 cycles of pre-amplification. Additional cycles of pre-amplification resulted in increased signal. Of note, PRAME, a marker expressed at very low levels in this cell line, was detected only after 18 cycles of pre-amplification, demonstrating the utility of the technique.

Example 6 Clinical Data and Assay Validation

The assays described herein have been validated using actual patient samples from clinical studies. These include samples from patients with metastatic cancer (lung, breast, prostate, and melanoma), as well as patients with localized cancer (prostate). The assays were conducted as described in Examples 2 through 5.

FIGS. 15A, 15B, and 15C show a summary of clinical assays from patients with metastatic cancers of the lung (6 patients; FIG. 15A), breast (6 patients; FIG. 15B), and prostate (10 patients; FIG. 15C). Virtually all patients had a positive signal, whereas healthy controls had none. In this assay, all positive scores were added (cumulative score). However, as described below, the scores can also be broken down by individual genes, as shown in FIG. 16.

FIG. 16 illustrates the cumulative analysis of data from multiple probes, and shows a positive signal in 10/11 metastatic prostate cancer patients (91% on a per patient basis) versus 0/12 (0%) of healthy controls. On a per sample basis, 24 of 28 samples had a positive signal, indicating an 86% detection rate. In addition, some individual markers were also fairly effective, e.g., AGR2 (9/10 detection for metastatic cancer, and 0/3 for localized cancer), TMPRSS2 (5/10 and 1/3), KLK2 (6/10 and 0/3), STEAP2 (1/10 and 1/3), FAT1 (2/10 and 1/3), and FOLH1 (3/10 and 1/3).
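The detection rates quoted for FIG. 16 are simple proportions, and the per-marker counts can be ranked the same way. The sketch below uses only the counts stated in the text; the function and variable names are illustrative.

```python
def pct(positive, total):
    """Detection rate as a whole-number percentage."""
    return round(100 * positive / total)


# (positive, total) counts for metastatic prostate cancer, as stated for FIG. 16.
METASTATIC = {
    "AGR2": (9, 10),
    "TMPRSS2": (5, 10),
    "KLK2": (6, 10),
    "STEAP2": (1, 10),
    "FAT1": (2, 10),
    "FOLH1": (3, 10),
}

per_patient = pct(10, 11)  # cumulative score, per patient basis
per_sample = pct(24, 28)   # cumulative score, per sample basis

# Rank individual markers by their metastatic detection rate, highest first.
ranked = sorted(METASTATIC, key=lambda m: METASTATIC[m][0] / METASTATIC[m][1], reverse=True)
print(per_patient, per_sample, ranked)
```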

As illustrated above, one can also break down the individual gene markers for independent validation and quantitation, using the multiplex fluorescence color scheme described above. In the example below, a patient with metastatic prostate cancer had multiple positive markers, a patient with localized prostate cancer had a smaller number of positive scores within fewer markers, and a healthy control was negative for all markers.

FIG. 17 shows clinical data from three representative patient samples. In two separate reactions with four gene transcripts each (8 probes total), a blood sample from a patient with metastatic prostate cancer showed multiple signals (all probes are positive to various degrees). In contrast, a blood sample from a patient with localized (curable) prostate cancer showed weaker (but clearly detectable) signal. Whereas probes 1 (TMPRSS2), 5 (KLK3), 6 (HOXB13), 7 (AGR2) had the strongest signal in the metastatic cancer patient, probes 2 (FAT1) and 4 (STEAP2) were most positive in the localized cancer patient. This result clearly illustrates the heterogeneity in signal among cancer cells in the blood and the importance of dissecting the differential signals within the assay. Blood from a HD control (processed identically to the cancer patient samples) had a complete absence of signal.

Example 7 Measurement of Signaling Pathways within CTCs

In addition to providing a digital (quantitative) measure of CTCs present within a blood sample, our d-CTC assay also allowed analysis of specific signaling pathways that are unique to the tumor cells in the blood. For instance, a subset of prostate lineage-specific genes were driven by androgen signaling (such as PSA), while another subset was repressed by androgen signaling (such as PSMA). By analyzing these genes together, we can ascertain the status of androgen signaling within CTCs. Determining the total CTC signal in the blood, together with information about the effectiveness of the therapeutic agent in targeting and shutting off the critical pathway, is important for therapeutic monitoring.

We have illustrated this concept in prostate cancer, where the anti-androgenic agent abiraterone is effective in suppressing cancer progression, particularly in tumors that are still dependent on the androgen pathway. Below, we show the results for a patient with “Castrate Resistant Prostate Cancer (CRPC)” who was no longer responding to first line leuprolide and was treated with abiraterone. The androgen response markers (green) were initially suppressed by the therapy as it showed initial efficacy, but subsequently returned as the tumor became resistant and the patient experienced disease progression on this drug.

FIG. 18 provides the results of a clinical study of a patient with metastatic prostate cancer. The subset of signals from “androgen receptor-induced genes (AR-On)” is shown in green at the top of the bars in this bar graph, while the subset of signals from “androgen-repressed genes (AR-Off)” is shown in red at the bottom of each bar. As the patient is treated with the androgen pathway inhibitor abiraterone (e.g., ZYTIGA® (abiraterone acetate)), the AR-On signal is greatly reduced, indicating effective suppression of the androgen pathway within cancer cells in the blood. By cycle 4 of drug treatment, however, the androgen pathway appears to be reactivated in cancer cells (increasing green signal), indicative of drug resistance. Serum PSA measurements taken at these time points are consistent with failure of drug treatment.
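One simple way to summarize the androgen pathway status at each time point is the fraction of total droplet signal contributed by AR-On genes. The sketch below is an illustration only: the droplet counts and time point labels are hypothetical values chosen to mimic the pattern described for FIG. 18, the single-gene groupings (PSA as AR-On, PSMA as AR-Off) follow the examples named in the text, and the function name is an assumption.

```python
def ar_on_fraction(ar_on_counts, ar_off_counts):
    """Fraction of total signal from androgen-induced genes, or None if no signal."""
    on = sum(ar_on_counts.values())
    off = sum(ar_off_counts.values())
    total = on + off
    return on / total if total else None


# Hypothetical droplet counts per time point (not patient data).
time_points = {
    "baseline": ({"PSA": 30}, {"PSMA": 10}),
    "cycle 2": ({"PSA": 2}, {"PSMA": 8}),
    "cycle 4": ({"PSA": 24}, {"PSMA": 6}),
}

for label, (on_counts, off_counts) in time_points.items():
    print(label, ar_on_fraction(on_counts, off_counts))
```

A falling fraction would be consistent with pathway suppression and a rebound with reactivation, although the total signal should be read alongside the fraction, since a ratio alone hides changes in overall CTC burden.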

Example 8 Non-Specific Pre-Amplification to Improve Detection of Tumor-Specific mRNA

Similar to Example 5, non-specific whole transcriptome amplification (WTA) can be used to increase the detection rate of CTC-specific transcripts. This method relies on the use of random primers that amplify not only the targets of interest but all messages found in the product. In this example, the SMARTer™ Ultra Low RNA kit protocol (Clontech) was used as described below:

Transfer RNA to PCR Tubes or Plate

-   1) Add 1 uL of 1:50,000 diluted ERCC Spike-In Mix 1 to each sample
-   2) Bring the volume of each sample up to 10 uL
-   3) Add 1 uL of 3′ SMART CDS Primer IIA to each sample
-   4) Run “72 C” thermocycler program:
    -   72° C. 3 min
    -   4° C. forever

First Strand cDNA Master Mix (FSM), 1×:

-   4 uL 5× First-Strand Buffer
-   0.5 uL DTT
-   1 uL dNTP Mix
-   1 uL SMARTer IIA Oligonucleotide
-   0.5 uL RNase Inhibitor
-   2 uL SMARTScribe RT
-   9 uL per sample

-   5) Prepare the 10% excess FSM for your sample number, then add 9 uL of FSM to each sample and pipet to mix
-   6) Run “cDNA” thermocycler program:
    -   42° C. 90 min
    -   70° C. 10 min
    -   4° C. forever

Second Strand Synthesis and Amplification (SSM):

1× SSM:

-   25 uL 2× SeqAmp PCR Buffer
-   1 uL Primer IIA-v3
-   1 uL SeqAmp DNA Polymerase
-   3 uL Nuclease-free water
-   30 uL per sample

-   7) Prepare the 10% excess SSM for your sample number, then add 30 uL of SSM to each sample and pipet to mix
-   8) Run “PCR” thermocycler program:
    -   95° C. 1 min
    -   X cycles of:
        -   98° C. 10 sec
        -   65° C. 30 sec
        -   68° C. 3 min
    -   72° C. 10 min
    -   4° C. forever

The number of cycles can be adjusted depending on RNA input (e.g., 18 cycles for single cells or 9 cycles for 10 ng of RNA input). In addition, the 4° C. hold can serve as an overnight stopping point.

Solid Phase Reversible Immobilization (SPRI) Purification:

Transfer PCR product to lo-bind 1.5 mL Eppendorfs and label a second set of tubes with sample IDs; run the SPRI protocol at RT until the final elution

-   9) Incubate AMPure™ XP beads (stored at 4° C.) at RT for at least 30 minutes
-   10) Ensure that a sufficient amount of Elution Buffer is thawed and at RT
-   11) Make 80% ethanol (at least 400 uL per sample)
-   12) Vortex beads well before adding 50 uL of beads to each sample, pipetting up and down 5-10 times to mix well. Note: When pipetting beads, it's advisable to use RPT tips for better control of the volumes added and less residual bead binding in the tips
-   13) Incubate samples at RT for 5 minutes
-   14) Place samples on the magnet and let sit for 5 minutes
-   15) Pipet out the supernatant (˜95 uL) without disturbing the beads (check for brown color in the pipet tip and put back in tube if there's a significant amount of bead loss)
-   16) Wash twice with 200 uL of 80% ethanol; do not mix or disturb the bead pellet. Simply submerge the bead pellet in the ethanol for 30 seconds and then remove the ethanol. Try not to let the bead pellet dry between ethanol washes.
-   17) Air-dry the samples on the magnetic rack until the bead pellets are no longer shiny but before they crack. Pipet off any residual ethanol that pools at the bottom while drying (Note: The drying time can vary greatly depending on the DNA concentration after amplification). Single-cell level RNA inputs generally take 3-5 minutes to dry, while other IFD product samples have taken up to an hour.
-   18) Elute pellets in 17 uL of Elution Buffer as they begin to crack. Remove a sample from the magnet and pipet the buffer over the pellet repeatedly until all of the beads are in solution; then pipet mix to fully resuspend the beads (this will work to varying degrees for each sample). Try not to mix too vigorously as this creates many bubbles, which tends to decrease the attainable elution volume.
-   19) Let the resuspended samples incubate at RT for at least 2 minutes, then quick spin all of the samples.
-   20) Put the samples back on the magnetic rack for 5 minutes.
-   21) Pipet off ˜15 uL of your eluted amplified cDNA and check for beads in the pipette tip. If beads are present, pipet the solution back over the bead pellet and let sit for ˜1 minute before attempting another elution. Otherwise, store in a new lo-bind 1.5-mL Eppendorf, PCR tube, or 96-well PCR plate. Note: If you are repeatedly getting beads in the elution product, the only solution may be to decrease your aspiration volume to 14 uL or lower.

This whole transcriptome amplification (WTA) approach was first tested in cell lines derived from different cancers. FIGS. 19A and 19B show three different replicates of SMARTer-preamplified cDNA (18 cycles) from a liver cancer cell line (HEPG2) analyzed with 12 probes from the liver cancer panel. As shown in FIG. 19A, while the amplification efficiency for each target region is different, it is consistent among the three replicates (WTA1, WTA2, WTA3), demonstrating the reproducibility of this approach. As shown in FIG. 19B, these methods using 18 cycles of SMARTer pre-amplification provide an increase in signal of approximately four orders of magnitude (10⁸ vs 10⁴), providing a great boost in detection.

Example 9 Multiplexed vs. Individual Marker Assays for Liver Cancer

For each sample, 10-20 mL of blood was collected from the patient. Blood was processed within 3 hours of arrival on a CTC-iChip running in negative depletion mode. RNA was extracted from the product using a Qiagen RNeasy™ plus Micro kit, and 5 uL of the available 17 uL was amplified using ClonTech's v3 SMARTer™ whole-transcriptome amplification (WTA) strategy. 1% of the WTA product was then loaded into each well of a digital PCR plate, and 500 nM Taqman™ primer/probe combinations were used to determine the transcript concentration for each gene of interest. Transcript counts were normalized to blood volume and compared between HCC, HD, and CLD patients. HCC patients are defined as patients with biopsy-confirmed, non-resected hepatocellular carcinoma. CLD patients are patients with liver disease of varying etiologies (alcohol-mediated, HBV, HCV) who have a negative ultrasound/MRI. HDs are healthy donors external to the lab who donate 10-20 mL of blood.

FIGS. 20A to 20C show the total droplet numbers in 21 hepatocellular carcinoma (HCC) patients (FIG. 20A), 13 chronic liver disease (CLD) patients (FIG. 20B), and 15 healthy donors (HDs) (FIG. 20C). HCC patients show higher numbers of droplets compared to both CLD patients and HDs, suggesting that the panel is very clean in the high-risk CLD group and can be used to screen those patients for the development of liver cancer. This is an important result given the low specificity of the screening methods for liver cancer currently available in the clinic.

Among CLD patients, the American Association of Liver Disease recommends ultrasound (US) every 6 months, with a detailed algorithm dependent on the size of liver lesion detected. A prospective combined AFP gene marker-ultrasound screening in China demonstrated a 37% mortality benefit for those who were screened compared to those who were not, even when the screened population only maintained a compliance rate of 60%.

The sensitivity and specificity of each assay are dependent on the threshold values chosen to define “diseased” vs. “non-diseased,” but using 20 ug/L, the AFP gene marker has a sensitivity between 50-80% and a specificity between 80-90%. In a study using 20 ng/ml as the cut-off point, the sensitivity rose to 78.9%, although the specificity declined to 78.1% (Taketa, Alpha-fetoprotein, J. Med. Technol., 1989; 33:1380). On the other hand, the overall detection rate of the present assay was 76% when taking into account the clinical history of the patients and correcting for the ones that received curative resection or liver transplant with 100% specificity.

In addition, while all 11 markers of the liver cancer assay used herein contributed to the 76% sensitivity, the top 5 markers (AHSG, ALB, APOH, FGB and FGG) by themselves have 70% sensitivity, while the top 3 markers alone (ALB, FGB, FGG) result in 67% sensitivity. ALB alone detected 56% of the cases.

Example 10 Multiplexed vs. Individual Marker Assays for Lung Cancer

Blood samples from 8 metastatic lung cancer patients and 8 healthy donors were processed through the CTC-chip as previously described. Samples were spun down, treated with RNAlater™, and stored at −80° C. RNA was purified and cDNA was synthesized as described. STA was performed on each sample using 6 μl cDNA and the nested primers corresponding to the probes listed in the figure. 1 μl of STA product was loaded into each droplet PCR reaction.

Droplet numbers were normalized to blood volume. As shown in FIGS. 21A and 21B, the multiplexed lung gene marker panel was able to detect 100% (8/8) metastatic lung cancer patient samples above the background of the 8 healthy donors. The sensitivity of each marker of the lung panel was also determined and the results show that SFRP had a detection rate of 8/8, FAT1 Probe 2 had a detection rate of 7/8,TMPRSS4 had a detection rate of 6/8, FOXF1 and ARG2, Probe 2 had a detection rate of 5/8, FAT1 had a detection rate of 4/8, FAT2 and AGR2 had a detection rate of 3/8, and FAT2, Probe 2 had a detection rate of 2/8.

Assays for SERPINA3 and SFRP2 indicated that SFRP2 is effective for both lung and breast cancer detection, whereas the former seems more specific for breast cancer detection, but also detects some lung cancer samples.

Example 11 Multiplexed vs. Individual Marker Assays for Breast Cancer

Blood samples from 9 metastatic breast cancer patients, 5 localized breast cancer patients, and 15 healthy donors were processed through the CTC-Chip. Products were pelleted, treated with RNAlater™, and stored at −80° C. RNA and cDNA from each sample were prepared as previously described. 6 μl cDNA from each sample was STA amplified using nested primers corresponding to the probes listed in FIG. 22 (FAT2, SCGB2A1, PGR, PRAME, TFAP2C, S100A2, FAT1, AGR2, PKP3, RND3, and PIP). Droplet numbers were normalized to blood volumes, and the highest healthy donor value for each marker was subtracted from the patient sample values.

FIG. 22 shows the above-background signal for each patient. These methods detected 7/9 (78%) of metastatic samples and 2/5 (40%) of localized samples. The sensitivity of each marker alone varied from 1/14 to 6/14, with the two most relevant markers being AGR2 (6/14) and FAT1 (5/14), and the next four most relevant markers being RND3, PKP3, PRAME, and SCGB2A1 (3/14 each).
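The normalization and background subtraction described in this example can be sketched as follows. This is a minimal illustration, assuming droplet counts are held in per-marker dictionaries; the function names and the numeric inputs in the usage lines are hypothetical and are not patient data.

```python
def normalize(droplet_counts, blood_ml):
    """Convert raw droplet counts per marker into droplets per mL of blood."""
    return {marker: count / blood_ml for marker, count in droplet_counts.items()}


def highest_background(healthy_donors):
    """Highest normalized healthy-donor value observed for each marker."""
    background = {}
    for donor in healthy_donors:
        for marker, value in donor.items():
            background[marker] = max(background.get(marker, 0.0), value)
    return background


def above_background(patient, background):
    """Subtract the per-marker background; negative results are clipped to zero."""
    return {m: max(v - background.get(m, 0.0), 0.0) for m, v in patient.items()}


# Hypothetical inputs: two healthy donors and one patient, 10 mL of blood each.
hd = [normalize({"AGR2": 2, "FAT1": 0}, 10), normalize({"AGR2": 0, "FAT1": 5}, 10)]
bg = highest_background(hd)
signal = above_background(normalize({"AGR2": 30, "FAT1": 4}, 10), bg)
print(signal)
```

Using the highest healthy donor value as the threshold is a conservative choice: it favors specificity, at the cost of being sensitive to a single outlying donor.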

Example 12 ARv7 Detection in Metastatic Breast Cancer

Blood samples from 10 metastatic breast cancer patients and 7 healthy donors were processed through the CTC-Chip. Products were pelleted, treated with RNAlater™, and stored at −80° C. RNA and cDNA from each sample were prepared as previously described. 6 μl of non-amplified cDNA were loaded into each droplet PCR reaction. The samples were analyzed with probes against the v7 isoform of the androgen receptor (ARv7, sequence in Table 1). Droplet number was normalized to blood volume.

As shown in FIG. 23A, ARv7 was detected in 5/10 patients (50%) at above background (HD) levels, demonstrating that the assay is successful at detecting ARv7 from liquid biopsy. One of the patients had a triple negative breast cancer, suggesting utility of ARv7 as a marker even in the triple negative breast cancer (TNBC) context (e.g., patients who do not express genes for any of the three most common breast cancer markers, the estrogen receptor (ER), HER2/neu, and the progesterone receptor (PR) marker).

Example 13 Multiplexed vs. Individual Marker Assays for Melanoma

Blood samples from 34 metastatic or unresectable melanoma patients, each with multiple draw points (total draw points: 182), and 15 healthy donors were processed through the CTC-Chip. Products were pelleted, treated with RNAlater™, and flash frozen at −80° C. RNA and cDNA from each sample were prepared as previously described. 12 μl cDNA from each sample was amplified by specific target amplification (10 cycles) using nested primers corresponding to the probes listed along the bottom of the graph in FIG. 24A (individual markers PMEL, MLANA, MAGEA6, PRAME, TFAP2C, and SOX10). Droplet numbers were normalized to blood volumes. FIG. 24B shows a dot plot distribution of droplet signals detected in melanoma patients as compared to healthy donors. The detection sensitivity was 81% for all patient draw points (a patient draw is scored positive if any 1 of 6 markers shows droplet signals above the highest background signal in HD for that particular marker). Of the individual markers, PMEL and MLANA showed the highest detection rate.
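The scoring rule stated above (a draw is positive if any one of the six markers exceeds the highest healthy donor signal for that marker) can be expressed as a short sketch. The threshold and draw values below are hypothetical, and the function names are illustrative assumptions.

```python
MARKERS = ("PMEL", "MLANA", "MAGEA6", "PRAME", "TFAP2C", "SOX10")


def draw_is_positive(draw, hd_max):
    """True if any marker in the draw exceeds its highest healthy-donor value."""
    return any(draw.get(marker, 0.0) > hd_max[marker] for marker in hd_max)


def detection_sensitivity(draws, hd_max):
    """Fraction of patient draws scored positive."""
    positives = sum(draw_is_positive(draw, hd_max) for draw in draws)
    return positives / len(draws)


# Hypothetical per-marker thresholds (droplets per mL) and three patient draws.
hd_max = {marker: 0.5 for marker in MARKERS}
draws = [
    {"PMEL": 3.0, "MLANA": 0.1},  # positive: PMEL above threshold
    {"SOX10": 0.5},               # negative: equal to threshold, not above it
    {"PRAME": 0.2, "TFAP2C": 0.9},  # positive: TFAP2C above threshold
]
print(detection_sensitivity(draws, hd_max))
```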

Other Embodiments

It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims. 

1. A method for analyzing circulating tumor cells (CTCs) in a blood sample with ultra-high sensitivity and specificity, the method comprising isolating from a blood sample a product comprising CTCs and other cells present in blood; isolating ribonucleic acid (RNA) molecules from the product; generating cDNA molecules in solution from the isolated RNA; encapsulating cDNA molecules into individual droplets; amplifying cDNA molecules in each droplet in the presence of one or more reporter groups configured to bind specifically to cDNA from CTCs and not to cDNA from other cells in the blood; detecting droplets that contain bound reporter groups as an indicator of the presence of cDNA molecules from CTCs in the droplets; and analyzing CTCs in the detected droplets.
 2. (canceled)
 3. The method of claim 1, further comprising reducing a volume of the product before isolating RNA.
 4. (canceled)
 5. The method of claim 1, wherein generating cDNA molecules from the isolated RNA comprises conducting reverse transcription (RT) polymerase chain reaction (PCR) of the isolated RNA molecules.
 6. The method of claim 1, wherein amplifying cDNA or cDNA molecules within each of the droplets comprises conducting PCR in each droplet.
 7. The method of claim 1, wherein encapsulating individual cDNA molecules further comprises encapsulating PCR reagents in individual droplets with the cDNA molecules and forming at least 1000 droplets of a non-aqueous liquid.
 8. The method of claim 1, wherein the one or more reporter groups comprise a fluorescent label.
 9. The method of claim 4, wherein removing contaminants from the cDNA-containing solution comprises the use of Solid Phase Reversible Immobilization (SPRI), comprising immobilizing cDNA in the solution with magnetic beads that are configured to specifically bind to the cDNA; removing contaminants from the solution; and eluting purified cDNA.
 10. (canceled)
 11. The method of claim 7, wherein the non-aqueous liquid comprises one or more fluorocarbons, hydrofluorocarbons, mineral oils, silicone oils, and hydrocarbon oils.
 12. The method of claim 6, wherein probes and primers for use in amplifying the cDNA molecules within each of the droplets correspond to one or more probes and primers that relate to one or more selected cancer-selective genes listed in Table 1.
 13. The method of claim 12, wherein the selected cancer-selective genes include prostate cancer-selective genes.
 14. The method of claim 12, wherein the selected cancer-selective genes include breast cancer-selective genes.
 15. The method of claim 12, wherein the selected cancer-selective genes include genes selective for one or more of lung cancer, pancreatic cancer, liver cancer, and melanoma.
 16. The method of claim 12, wherein the selected cancer-selective genes include one or more genes selective for two or more, three or more, four or more, or five or more different types of cancer.
 17. The method of claim 16, wherein the genes are selective for breast cancer and lung cancer; breast cancer, lung cancer, and liver cancer; breast cancer, lung cancer, and pancreatic cancer; breast cancer, lung cancer, and prostate cancer; breast cancer, liver cancer, and melanoma; breast cancer, lung cancer, and melanoma; breast cancer, lung cancer, liver cancer, and prostate cancer; breast cancer, lung cancer, liver cancer, and melanoma; breast cancer, lung cancer, liver cancer, and pancreatic cancer; breast cancer, lung cancer, prostate cancer, and pancreatic cancer; breast cancer, lung cancer, liver cancer, melanoma, and pancreatic cancer; or breast cancer, lung cancer, liver cancer, melanoma, pancreatic cancer, and prostate cancer.
 18. The method of claim 1, wherein the CTCs arise from metastatic or primary/localized cancers.
 19. The method of claim 1, wherein analyzing the CTCs in the detected droplets comprises monitoring CTCs from blood samples taken over time from a patient with a known cancer, and testing, imaging, or both testing and imaging the CTCs to provide a prognosis for the patient.
 20. The method of claim 1, wherein analyzing the CTCs in the detected droplets comprises testing, imaging, or testing and imaging the CTCs from a blood sample from a patient to provide an indication of a response by the CTCs to a therapeutic intervention.
 21. The method of claim 1, wherein analyzing the CTCs in the detected droplets comprises determining a number or level of CTCs per unit volume of a blood sample from a patient to provide a measure of tumor burden in the patient.
 22. The method of claim 21, further comprising using the measure of tumor burden in the patient to select a therapy.
 23. The method of claim 22, further comprising determining the measure of tumor burden in the patient at a second time point to monitor the tumor burden over time.
 24-28. (canceled)